Abstract

Cognitive radios have been proposed as a means to implement efficient reuse of
the licensed spectrum. The key feature of a cognitive radio is its ability to recognize
the primary (licensed) user and adapt its communication strategy to minimize the
interference that it generates. We consider a communication scenario in which the
primary and the cognitive user wish to communicate to different receivers, subject
to mutual interference. Modeling the cognitive radio as a transmitter with
side-information about the primary transmission, we characterize the largest rate at
which the cognitive radio can reliably communicate under the constraint that (i)
no interference is created for the primary user, and (ii) the primary encoder-decoder
pair is oblivious to the presence of the cognitive radio.



1 Introduction 



Observing a severe under-utilization of the licensed spectrum, the FCC has recently 
recommended [3 |HI that significantly greater spectral efficiency could be realized by de- 
ploying wireless devices that can coexist with the incumbent licensed (primary) users, 
generating minimal interference while somehow taking advantage of the available re- 
sources. Such devices could, for instance, form real-time secondary markets [H] for the 
licensed spectrum holders of a cellular network or even, potentially, allow a complete 
secondary system to simultaneously operate in the same frequency band as the primary.
The characteristic feature of these cognitive radios would be their ability to recognize their
communication environment and adapt the parameters of their communication scheme to
maximize the quality of service for the secondary users while minimizing the interference
to the primary users.

In this paper, we study the fundamental limits of performance of wireless networks 
endowed with cognitive radios. In particular, in order to understand the ultimate system- 
wide benefits of the cognitive nature of such devices, we assume that the cognitive radio 
has non-causal knowledge of the codeword of the primary user in its vicinity^1; in this,
we are motivated by the model proposed in [H]. We address the following fundamental 
question: 

What is the largest rate that the cognitive radio can achieve under the constraint that 



(i) it generates no interference for the primary user in its vicinity, and 

(ii) the primary receiver uses a single-user decoder, just as it would in the absence of 
the cognitive radio? 



We will refer to these two imperative constraints as the coexistence conditions that a 
cognitive secondary system must satisfy. 

Of central interest to us is the communication scenario illustrated in Fig. 1. The
primary user wishes to communicate to the primary base-station $B_p$. In its vicinity is a
secondary user equipped with a cognitive radio that wishes to transmit to the secondary
base-station $B_s$. We assume that the cognitive radio has obtained the message of the
primary user. The received signal-to-noise ratio of the cognitive radio's transmission at
the secondary base-station is denoted by SNR. The transmission of the cognitive radio is
also received at $B_p$, and the signal-to-noise ratio of this interfering signal is denoted by
INR (interference-to-noise ratio). If the cognitive user is close to $B_p$, INR could potentially
be large.

Our main result is the characterization of the largest rate at which the cognitive
radio can reliably communicate with its receiver $B_s$ under the coexistence conditions and
in the "low-interference-gain" regime in which INR < SNR. This regime is of practical
interest since it models the realistic scenario in which the cognitive radio is closer to $B_s$
than to $B_p$. Moreover, we show that the capacity-achieving strategy is for the cognitive
radio to perform precoding for the primary user's codeword and transmit over the same
time-frequency slot as that used by the primary radio.

To prove our main result, we allow the primary and secondary systems to cooperate
and jointly design their encoder-decoder pairs and then show that the optimal
communication scheme for this cooperative situation has the property that the primary decoder
does not depend on the encoder and decoder used by the secondary system. This
cooperative communication scenario can be thought of as an interference channel PP, |l(ij .
[4] but with degraded message sets^2: Achievable schemes for this channel were first
studied in [6]. A related problem of communicating a single private message along with
a common message to each of the receivers has been studied in [12].

^1 Note that this does not imply that the cognitive user can decode the information that the primary
user is communicating since there are secure encryption protocols running at the application layer. The
decoded codeword is a meaningless stream of bits for the cognitive user.

Figure 1: A possible arrangement of the primary and secondary receivers, base-stations
$B_p$ and $B_s$, respectively. The cognitive secondary user is represented by the circle and
the primary user is represented by the square. The side-information path is depicted by
the dotted line.



Furthermore, we exhibit a regime in which joint code design is beneficial when one 
considers the largest set of simultaneously achievable rates of the primary and cognitive 
users. We show that, unlike in the low-interference-gain regime, knowledge of the code 
used by the cognitive radio is required by the primary decoder in order to achieve all the 
rates in the capacity region of this interference channel when INR >> SNR.

The rest of this paper is organized as follows. We first present the Gaussian cognitive
channel in Section 2. We state our main result, the capacity of the cognitive channel
in the low-interference-gain regime INR < SNR, in Section 3. The proof of our main
result is given in Section 4, where we demonstrate the capacity region of the underlying
interference channel with degraded message sets, which inherently allows for joint code
design. We then show that the benefit of joint code design becomes apparent in the
high-interference-gain regime INR >> SNR; this is done in Section 4.2.5. Finally, we study the
system-level implications of the optimal cognitive communication scheme in Section 5.

^2 The primary radio has only a subset of the messages available to the cognitive radio.



2 The Channel Model and Problem Statement 



2.1 The cognitive channel 



Consider the following communication scenario, which we will refer to as the cognitive
channel.

Figure 2: The (Gaussian) cognitive channel after $n$ channel uses. The dashed lines
represent interfering receptions. The dotted line represents the side-information path.
The power constraints are $\tilde{P}_p$ and $\tilde{P}_c$ and noise variances are $N_p$ and $N_s$.



The additive noise at the primary and secondary receivers,
$\tilde{Z}_p^n := (\tilde{Z}_{p,1}, \tilde{Z}_{p,2}, \ldots, \tilde{Z}_{p,n})$ and
$\tilde{Z}_s^n := (\tilde{Z}_{s,1}, \tilde{Z}_{s,2}, \ldots, \tilde{Z}_{s,n})$, is assumed to be i.i.d. across
symbol times $i = 1, 2, \ldots, n$ and distributed according to $\mathcal{N}(0, N_p)$ and
$\mathcal{N}(0, N_s)$, respectively^3. The correlation between $\tilde{Z}_p^n$ and $\tilde{Z}_s^n$ is
irrelevant from the standpoint of probability of error or capacity
calculations since the base-stations are not allowed to pool their signals. The primary
user has message $m_p \in \{0, 1, \ldots, 2^{nR_p}\}$ intended for the primary receiver to decode, the
cognitive user has message $m_c \in \{0, 1, \ldots, 2^{nR_c}\}$ intended for the secondary receiver as
well as the message $m_p$ of the primary user. The average power of the transmitted signals
is constrained by $\tilde{P}_p$ and $\tilde{P}_c$, respectively:

$$\|\tilde{X}_p^n\|^2 \le n\tilde{P}_p, \qquad \|\tilde{X}_c^n\|^2 \le n\tilde{P}_c. \tag{1}$$

The received signal-to-noise ratios (SNRs) of the desired signals at the primary and
secondary base-station are $p^2\tilde{P}_p/N_p$ and $c^2\tilde{P}_c/N_s$, respectively. The received SNRs of
the interfering signals at the primary and secondary base-station (INRs) are $f^2\tilde{P}_c/N_p$
and $g^2\tilde{P}_p/N_s$, respectively. The constants $(p, c, f, g)$ are assumed to be real, positive and
globally known. The results of this paper easily extend to the case of complex coefficients
(see Section 5.3). The channel can be described by the pair of per-time-sample equations

$$\tilde{Y}_p = p\tilde{X}_p + f\tilde{X}_c + \tilde{Z}_p, \tag{2}$$

$$\tilde{Y}_s = g\tilde{X}_p + c\tilde{X}_c + \tilde{Z}_s, \tag{3}$$

where $\tilde{Z}_p$ is $\mathcal{N}(0, N_p)$ and $\tilde{Z}_s$ is $\mathcal{N}(0, N_s)$.

^3 Throughout the paper we will denote vectors in $\mathbb{R}^n$ by $X^n := (X_1, X_2, \ldots, X_n)$.



2.2 Transformation to standard form 



We can convert every cognitive channel with gains $(p, f, g, c)$, power constraints
$(\tilde{P}_p, \tilde{P}_c)$ and noise powers $(N_p, N_s)$ to a corresponding standard-form cognitive channel
with gains $(1, a, b, 1)$, power constraints $(P_p, P_c)$ and noise powers $(1, 1)$, expressed by the
pair of equations

$$Y_p = X_p + aX_c + Z_p, \tag{4}$$

$$Y_s = bX_p + X_c + Z_s, \tag{5}$$

where

$$a := \frac{f\sqrt{N_s}}{c\sqrt{N_p}}, \qquad b := \frac{g\sqrt{N_p}}{p\sqrt{N_s}}, \qquad
P_p := \frac{p^2\tilde{P}_p}{N_p}, \qquad P_c := \frac{c^2\tilde{P}_c}{N_s}. \tag{6}$$

Here tildes mark the quantities of the original channel. The capacity of this cognitive
channel is the same as that of the original channel since the two channels are related by
invertible transformations^4 that are given by

$$Y_p := \frac{\tilde{Y}_p}{\sqrt{N_p}}, \qquad X_p := \frac{p\tilde{X}_p}{\sqrt{N_p}}, \qquad Z_p := \frac{\tilde{Z}_p}{\sqrt{N_p}}, \tag{7}$$

$$Y_s := \frac{\tilde{Y}_s}{\sqrt{N_s}}, \qquad X_c := \frac{c\tilde{X}_c}{\sqrt{N_s}}, \qquad Z_s := \frac{\tilde{Z}_s}{\sqrt{N_s}}. \tag{8}$$

In deriving our main result we will consider this standard form of the cognitive channel
without loss of generality and we will refer to it as the cognitive $(1, a, b, 1)$ channel.

^4 These transformations were used in ^, [3] and ^S], in the context of the classical interference
channel.

Figure 3: The cognitive channel in standard form. The channel gains $(p, f, g, c)$ in the
original channel are mapped to $(1, a, b, 1)$, powers $(\tilde{P}_p, \tilde{P}_c)$ are mapped to $(P_p, P_c)$, and
noise variances $(N_p, N_s)$ are mapped to $(1, 1)$.
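
The mapping to standard form can be checked numerically. The following is a minimal sketch, assuming the standard-form definitions $a = f\sqrt{N_s}/(c\sqrt{N_p})$, $b = g\sqrt{N_p}/(p\sqrt{N_s})$, $P_p = p^2\tilde{P}_p/N_p$ and $P_c = c^2\tilde{P}_c/N_s$ with real, positive gains. The function name `to_standard_form`, its argument names, and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def to_standard_form(p, f, g, c, P_p_orig, P_c_orig, N_p, N_s):
    """Map an original cognitive channel to its (1, a, b, 1) standard form.

    P_p_orig and P_c_orig are the power constraints of the original channel;
    N_p and N_s are the noise variances at the primary and secondary receivers.
    """
    a = f * math.sqrt(N_s) / (c * math.sqrt(N_p))  # cross gain: cognitive -> primary receiver
    b = g * math.sqrt(N_p) / (p * math.sqrt(N_s))  # cross gain: primary -> secondary receiver
    P_p = p ** 2 * P_p_orig / N_p                  # received SNR of the primary link
    P_c = c ** 2 * P_c_orig / N_s                  # received SNR of the cognitive link
    return a, b, P_p, P_c


# Hypothetical example values (assumptions, not taken from the paper).
a, b, P_p, P_c = to_standard_form(p=1.0, f=0.5, g=0.8, c=2.0,
                                  P_p_orig=4.0, P_c_orig=1.0, N_p=1.0, N_s=4.0)
print(a, b, P_p, P_c)
```

In standard form the two interference-to-noise ratios become $a^2 P_c$ and $b^2 P_p$, which should agree with $f^2\tilde{P}_c/N_p$ and $g^2\tilde{P}_p/N_s$ computed on the original channel.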



2.3 Coding on the cognitive channel 

Let the channel input alphabets of the primary and cognitive radios be $\mathcal{X}_p = \mathbb{R}$ and
$\mathcal{X}_c = \mathbb{R}$, respectively. Similarly, let the channel output alphabets at the primary and
secondary receivers be $\mathcal{Y}_p = \mathbb{R}$ and $\mathcal{Y}_s = \mathbb{R}$, respectively.

The primary receiver is assumed to use a standard single-user decoder to decode
$m_p \in \{1, 2, \ldots, 2^{nR_p}\}$ from $Y_p^n$, just as it would in the absence of the secondary system: Any
decoder which achieves the AWGN channel capacity, such as the maximum-likelihood
decoder or the joint-typicality decoder, will suffice. Following standard nomenclature,
we say that $R_p$ is achievable for the primary user if there exists a sequence (indexed by
$n$) of encoding maps, $E_p^n : \{1, 2, \ldots, 2^{nR_p}\} \to \mathcal{X}_p^n$, satisfying $\|X_p^n\|^2 \le nP_p$, and for which
the average probability of decoding error (average over the messages) vanishes as $n \to \infty$.

The cognitive radio is assumed to have knowledge of $m_p$, hence we have the following
definition:

Definition 2.1 (Cognitive code) A cognitive $(2^{nR_c}, n)$ code is a choice of an encoding
rule (whose output we denote by $X_c^n$)

$$E_c^n : \{1, 2, \ldots, 2^{nR_p}\} \times \{1, 2, \ldots, 2^{nR_c}\} \to \mathcal{X}_c^n, \tag{9}$$

such that $\|X_c^n\|^2 \le nP_c$, and a choice of a decoding rule

$$D_c^n : \mathcal{Y}_s^n \to \{1, 2, \ldots, 2^{nR_c}\}. \tag{10}$$

The following key definition formalizes the important notion of coexistence conditions
that the cognitive secondary system must satisfy.

Definition 2.2 (Achievability: cognitive user) A rate $R_c$ is said to be achievable for
the cognitive user on a cognitive $(1, a, b, 1)$ channel if there exists a sequence of cognitive
$(2^{nR_c}, n)$ codes such that the following two constraints are satisfied:

1. The average probability of error vanishes as $n \to \infty$, i.e.,

$$P_e^{(n)} := \frac{1}{2^{n(R_c + R_p)}} \sum_{i,j} \mathbb{P}\left(D_c^n(Y_s^n) \neq j \mid m_p = i, m_c = j\right) \to 0; \tag{11}$$

2. A rate of $R_p^* := \frac{1}{2}\log(1 + P_p)$ is achievable for the primary user.

Definition 2.3 (Capacity) The capacity of the cognitive channel is defined to be the
largest achievable rate $R_c$ for the cognitive user.



Our main result, presented in the following section, precisely quantifies the capacity of 
the cognitive channel in the "low-interference-gain" regime. 



3 The Main Result 



If the received SNR of the cognitive radio transmission is smaller at the primary receiver
than at the secondary receiver, we say that the primary system is affected by a low
interference gain. This is the case that is most likely to occur in practice since the
cognitive radio is typically closer to its intended receiver (the secondary base-station) than
to the primary base-station. In terms of the parameters of our problem, this situation
corresponds to $f\sqrt{N_s} < c\sqrt{N_p}$ in our original cognitive channel, or, equivalently, to $a < 1$
in the corresponding standard-form cognitive $(1, a, b, 1)$ channel. Our main result is an
explicit expression for the capacity of the cognitive channel in this regime.



Theorem 3.1 The capacity of the cognitive $(1, a, b, 1)$ channel is

$$R_c^* = \frac{1}{2}\log\left(1 + (1 - \alpha^*)P_c\right), \tag{12}$$

as long as $a < 1$. The constant $\alpha^* \in [0, 1]$ is defined in (17).

Note that Theorem 3.1 holds for any $b \in \mathbb{R}$ (or equivalently any $p, g \in \mathbb{R}$ in the original
cognitive channel).



4 Proof of the Main Result 



4.1 The forward part 

To show the existence of a capacity-achieving cognitive $(2^{nR_c}, n)$ code, we generate a
sequence of random codes such that the average probability of error (averaged over the
ensemble of codes and messages) vanishes as $n \to \infty$. In particular, we have the following
codes:

• $E_p^n$ ensemble: Given $m_p \in \{1, 2, \ldots, 2^{nR_p}\}$, generate the codeword $X_p^n \in \mathbb{R}^n$ by
drawing its coordinates i.i.d. according to $\mathcal{N}(0, P_p)$.

• $E_c^n$ ensemble: Since the cognitive radio knows $m_p$ as well as $E_p^n$, it can form $X_p^n$
and perform superposition coding as follows:

$$X_c^n = \hat{X}_c^n + \sqrt{\frac{\alpha P_c}{P_p}}\, X_p^n, \tag{13}$$

where $\alpha \in [0, 1]$. The codeword $\hat{X}_c^n$ encodes $m_c \in \{1, 2, \ldots, 2^{nR_c}\}$ and is generated
by performing Costa precoding jHj (also known as dirty-paper coding) treating
$\left(b + \sqrt{\alpha P_c / P_p}\right) X_p^n$ as non-causally known interference that will affect the secondary
receiver in the presence of $\mathcal{N}(0, 1)$ noise. The encoding is done by random binning 0.

• $D_c^n$: Costa decoder (having knowledge of the binning encoder $E_c^n$) [2].

The key result of Costa ^ is that, using the dirty-paper coding technique, the maximum
achievable rate is the same as if the interference was also known at the receiver,
i.e., as if it were absent altogether. The characteristic feature of this scheme is that
the resulting codeword $\hat{X}_c^n$ is statistically independent of $X_p^n$ and is i.i.d. Gaussian. To
satisfy the average power constraint of $P_c$ on the components of $X_c^n$, each coordinate of
$\hat{X}_c^n$ must, in fact, be $\mathcal{N}(0, (1 - \alpha)P_c)$. Hence, the primary receiver can treat $\hat{X}_c^n$ as
independent Gaussian noise. Using standard methodology, it can be shown that the average
probability of error for decoding $m_p$ (averaged over the code ensembles and messages)
vanishes, as $n \to \infty$, for all rates $R_p$ below

$$\frac{1}{2}\log\left(1 + \frac{\left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2}{1 + a^2(1 - \alpha)P_c}\right). \tag{14}$$

Similarly, the average probability of error in decoding $m_c$ vanishes for all rates $R_c$ below

$$\frac{1}{2}\log\left(1 + (1 - \alpha)P_c\right). \tag{15}$$



However, in order to ensure that a given rate is achievable for the cognitive user in the
sense of Definition 2.2, we must have that

$$\frac{1}{2}\log\left(1 + \frac{\left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2}{1 + a^2(1 - \alpha)P_c}\right) = \frac{1}{2}\log(1 + P_p) =: R_p^*. \tag{16}$$

Observe that, if $a = 0$, any choice of $\alpha \in [0, 1]$ will satisfy (16): in this case we should
set $\alpha^* = 0$ to maximize the rate achievable for the cognitive user. For $0 < a < 1$, by the
Intermediate Value Theorem, this equation, which is quadratic in $\sqrt{\alpha}$, always has a unique
root in $[0, 1]$:

$$\alpha^* = \left(\frac{\sqrt{P_p}\left(\sqrt{1 + a^2 P_c(1 + P_p)} - 1\right)}{a\sqrt{P_c}\,(1 + P_p)}\right)^2. \tag{17}$$

Finally, since the code-ensemble-averaged (and message-averaged) probabilities of error
vanish, there must exist a particular sequence of cognitive codes and primary encoders
for which the (message-averaged) probabilities of error vanish as well. Hence,
$R_c^* = \frac{1}{2}\log(1 + (1 - \alpha^*)P_c)$ is achievable for the cognitive user in the sense of
Definition 2.2. □
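
The power split that keeps the primary user's rate unchanged can be verified numerically. The following is a minimal sketch, assuming the closed form $\alpha^* = \left(\sqrt{P_p}\,(\sqrt{1 + a^2 P_c (1 + P_p)} - 1) / (a\sqrt{P_c}(1 + P_p))\right)^2$ and the primary SINR $(\sqrt{P_p} + a\sqrt{\alpha P_c})^2 / (1 + a^2(1-\alpha)P_c)$ from the forward part. The function names and the operating point are hypothetical, and the logarithm is taken to base 2 (bits per channel use) as an assumption, since the text does not fix the base.

```python
import math


def primary_sinr(alpha, a, P_p, P_c):
    """SINR at the primary receiver when the Costa-coded part is treated as noise."""
    signal = (math.sqrt(P_p) + a * math.sqrt(alpha * P_c)) ** 2
    noise = 1.0 + a ** 2 * (1.0 - alpha) * P_c
    return signal / noise


def alpha_star(a, P_p, P_c):
    """Closed-form power split that leaves the primary SINR equal to P_p."""
    if a == 0.0:
        return 0.0  # no interference reaches the primary, so no relaying is needed
    root = (math.sqrt(P_p) * (math.sqrt(1.0 + a ** 2 * P_c * (1.0 + P_p)) - 1.0)
            / (a * math.sqrt(P_c) * (1.0 + P_p)))
    return root ** 2


def cognitive_capacity_bits(a, P_p, P_c):
    """Rate 0.5 * log2(1 + (1 - alpha*) P_c), in bits per channel use."""
    return 0.5 * math.log2(1.0 + (1.0 - alpha_star(a, P_p, P_c)) * P_c)


# Hypothetical operating point: a = 0.5, P_p = 4, P_c = 1.
alpha = alpha_star(0.5, 4.0, 1.0)
print(alpha, primary_sinr(alpha, 0.5, 4.0, 1.0), cognitive_capacity_bits(0.5, 4.0, 1.0))
```

At this operating point the split evaluates to roughly 0.16, and substituting it back returns a primary SINR equal to $P_p$, which is the coexistence condition.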



4.2 The converse part 
4.2.1 Proof outline 



In order to prove the converse to our main result we will first relax the constraints of
our problem and allow for joint primary and cognitive code design. This relaxation
leads naturally to an interference channel with degraded message sets^5, which we will
abbreviate as IC-DMS for convenience.

Our approach is to first characterize the capacity region of the IC-DMS, i.e., the
largest set of rate tuples $(R_p, R_c)$ at which joint reliable communication can take place.
We then make the key observation that the joint coding scheme that achieves all the
rate tuples in the capacity region of the IC-DMS has the property that the decoder at
the primary receiver is a standard single-user decoder. Furthermore, we show that there
exists a point $(R_p, R_c) = (R_p^*, R_c^*)$ on the boundary of the capacity region of the IC-DMS,
where $R_p^* = \frac{1}{2}\log(1 + P_p)$ and $R_c^* = \frac{1}{2}\log(1 + (1 - \alpha^*)P_c)$ with $\alpha^*$ given by (17). We then
conclude that $R_c = R_c^*$ is the capacity of the corresponding cognitive channel.

^5 The primary user knows $m_p$ while the cognitive user knows $(m_p, m_c)$, hence the primary user has a
subset of the messages available to the cognitive user.



4.2.2 Joint code design: The IC-DMS 

The input-output equations of the IC-DMS, as for the cognitive channel, are given by
(2), (3) with the standard form given by (4), (5). We will denote the IC-DMS in standard
form by "$(1, a, b, 1)$-IC-DMS".

Definition 4.1 (IC-DMS code) A $(2^{nR_p}, 2^{nR_c}, n)$ code for the $(1, a, b, 1)$-IC-DMS is a
choice of an encoding rule and a decoding rule: The encoding rule is a pair of maps
(whose outputs we denote by $X_p^n$ and $X_c^n$, respectively)

$$e_p^n : \{1, 2, \ldots, 2^{nR_p}\} \to \mathcal{X}_p^n, \tag{18}$$

$$e_c^n : \{1, 2, \ldots, 2^{nR_p}\} \times \{1, 2, \ldots, 2^{nR_c}\} \to \mathcal{X}_c^n, \tag{19}$$

such that $\|X_p^n\|^2 \le nP_p$ and $\|X_c^n\|^2 \le nP_c$. The decoding rule is a pair of maps

$$d_p^n : \mathcal{Y}_p^n \to \{1, 2, \ldots, 2^{nR_p}\}, \tag{20}$$

$$d_c^n : \mathcal{Y}_s^n \to \{1, 2, \ldots, 2^{nR_c}\}. \tag{21}$$

Given that the messages selected are $(m_p = i, m_c = j)$, an error occurs if $d_p^n(Y_p^n) \neq i$ or
$d_c^n(Y_s^n) \neq j$.



Definition 4.2 (Achievability: IC-DMS) A rate vector $(R_p, R_c)$ is said to be achievable
if there exists a sequence of $(2^{nR_p}, 2^{nR_c}, n)$ codes such that the average probability of
error at each of the receivers vanishes as $n \to \infty$, i.e.,

$$P_{e,p}^{(n)} := \frac{1}{2^{n(R_p + R_c)}} \sum_{i,j} \mathbb{P}\left(d_p^n(Y_p^n) \neq i \mid m_p = i, m_c = j\right) \to 0, \tag{22}$$

$$P_{e,c}^{(n)} := \frac{1}{2^{n(R_p + R_c)}} \sum_{i,j} \mathbb{P}\left(d_c^n(Y_s^n) \neq j \mid m_p = i, m_c = j\right) \to 0. \tag{23}$$

Definition 4.3 (Capacity region) The capacity region of the IC-DMS is the closure
of the set of achievable rate vectors $(R_p, R_c)$.



4.2.3 The capacity region of the IC-DMS under a low interference gain 

The following theorem characterizes the capacity region of the $(1, a, b, 1)$-IC-DMS with
$a < 1$ and arbitrary $b \in \mathbb{R}$.

Theorem 4.1 The capacity region of the $(1, a, b, 1)$-IC-DMS with $a < 1$ and $b \in \mathbb{R}$ is
given by the union, over all $\alpha \in [0, 1]$, of the rate regions

$$0 \le R_p \le \frac{1}{2}\log\left(1 + \frac{\left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2}{1 + a^2(1 - \alpha)P_c}\right), \tag{24}$$

$$0 \le R_c \le \frac{1}{2}\log\left(1 + (1 - \alpha)P_c\right). \tag{25}$$

Proof of achievability: The random coding scheme described in the forward part of the
proof of Theorem 3.1 (Section 4.1) achieves the rates (24) and (25) stated in the theorem.
We emphasize that, in this scheme, the primary receiver employs a single-user decoder.

Proof of converse: See Appendix A.
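
The boundary of this region can be traced by sweeping the power-split parameter. The following is a minimal sketch, assuming the corner point for each $\alpha \in [0, 1]$ is $R_p = \frac{1}{2}\log(1 + (\sqrt{P_p} + a\sqrt{\alpha P_c})^2 / (1 + a^2(1-\alpha)P_c))$ and $R_c = \frac{1}{2}\log(1 + (1-\alpha)P_c)$. The function names, the grid size, the parameter values, and the choice of base-2 logarithms are assumptions made for illustration.

```python
import math


def rate_pair(alpha, a, P_p, P_c):
    """Corner point (R_p, R_c) of the rate region for a given power split alpha."""
    sinr_p = ((math.sqrt(P_p) + a * math.sqrt(alpha * P_c)) ** 2
              / (1.0 + a ** 2 * (1.0 - alpha) * P_c))
    R_p = 0.5 * math.log2(1.0 + sinr_p)
    R_c = 0.5 * math.log2(1.0 + (1.0 - alpha) * P_c)
    return R_p, R_c


def boundary_points(a, P_p, P_c, steps=11):
    """Sample the boundary by sweeping alpha uniformly from 0 to 1."""
    return [rate_pair(k / (steps - 1), a, P_p, P_c) for k in range(steps)]


# Hypothetical low-interference-gain example: a = 0.5, P_p = 4, P_c = 1.
for R_p, R_c in boundary_points(0.5, 4.0, 1.0):
    print(f"R_p = {R_p:.4f}  R_c = {R_c:.4f}")
```

As $\alpha$ grows, the cognitive radio spends more of its power relaying the primary codeword, so $R_p$ increases while $R_c$ decreases; the sweep therefore moves along the boundary from the cognitive user's best point to the primary user's best point.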



4.2.4 The capacity of the cognitive channel under a low interference gain 

The proof of Theorem 4.1 reveals that the jointly designed code that achieves all the
points on the boundary of the capacity region of the IC-DMS is such that the primary
receiver uses a standard single-user decoder, just as it would in the absence of the cognitive
radio. In other words, the primary decoder $d_p^n$ does not depend on $e_c^n$ and $d_c^n$. Thus, in
order to find the largest rate that is achievable by the cognitive user in the sense of
Definition 2.2, we can without loss of generality restrict our search to the boundary of the
capacity region of the underlying IC-DMS. Hence, to find the capacity of the cognitive
channel, we must solve equation (16), which is quadratic in $\sqrt{\alpha}$, for its positive root. The
solution is given by $\alpha^*$ in (17), hence the capacity is

$$R_c^* = \frac{1}{2}\log\left(1 + (1 - \alpha^*)P_c\right). \tag{26}$$

Thus we have established the proof of Theorem 3.1. □

The proof of the converse of Theorem 4.1 allows us to characterize the sum-capacity
of the $(1, a, b, 1)$-IC-DMS for any $a > 1$ and the entire capacity region if $a$ is sufficiently
large. These two ancillary results are shown in the following section.



4.2.5 The high-interference-gain regime 
The sum-capacity for a > 1

Corollary 4.1 The maximum of $R_p + R_c$ over all $(R_p, R_c)$ in the capacity region of the
$(1, a, b, 1)$-IC-DMS with $a > 1$ and $b \in \mathbb{R}$ is achieved with $\alpha = 1$ in (24) and (25), i.e.,

$$C_{\mathrm{sum}}(a) = \frac{1}{2}\log\left(1 + \left(\sqrt{P_p} + a\sqrt{P_c}\right)^2\right). \tag{27}$$

Proof: See Appendix B.
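
The corollary can be illustrated numerically. The following is a minimal sketch, assuming the per-$\alpha$ rate expressions $R_p(\alpha) = \frac{1}{2}\log(1 + (\sqrt{P_p} + a\sqrt{\alpha P_c})^2/(1 + a^2(1-\alpha)P_c))$ and $R_c(\alpha) = \frac{1}{2}\log(1 + (1-\alpha)P_c)$, and comparing their sum on a grid against $\frac{1}{2}\log(1 + (\sqrt{P_p} + a\sqrt{P_c})^2)$. The function names, the grid, the parameter values and the base-2 logarithm are assumptions for illustration only; the grid search is a sanity check, not a proof.

```python
import math


def sum_rate(alpha, a, P_p, P_c):
    """R_p + R_c for a given power split alpha (superposition + Costa scheme)."""
    sinr_p = ((math.sqrt(P_p) + a * math.sqrt(alpha * P_c)) ** 2
              / (1.0 + a ** 2 * (1.0 - alpha) * P_c))
    return 0.5 * math.log2(1.0 + sinr_p) + 0.5 * math.log2(1.0 + (1.0 - alpha) * P_c)


def sum_capacity(a, P_p, P_c):
    """Sum-rate obtained when all cognitive power relays the primary codeword (alpha = 1)."""
    return 0.5 * math.log2(1.0 + (math.sqrt(P_p) + a * math.sqrt(P_c)) ** 2)


# Hypothetical high-interference-gain example: a = 1.5, P_p = 4, P_c = 1.
grid = [k / 100 for k in range(101)]
best_alpha = max(grid, key=lambda al: sum_rate(al, 1.5, 4.0, 1.0))
print(best_alpha, sum_capacity(1.5, 4.0, 1.0))
```

For the example values the grid maximum sits at $\alpha = 1$, i.e., the sum-rate is largest when the cognitive radio devotes all of its power to relaying the primary message.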

Contrary to the development so far, in the following section we will observe that, in
the very-high-interference-gain regime, the optimal (jointly designed) IC-DMS code is
such that the primary decoder $d_p^n$ depends on the cognitive encoder $e_c^n$.

The benefit of joint code design 

When the interference gain at the primary receiver due to the cognitive radio transmis- 
sions (parameter a) is sufficiently large, the optimal decoder at the primary receiver of 
the IC-DMS is one that decodes the message of the cognitive user before decoding the 
message of the primary user. 

First, we demonstrate an achievable scheme in the following lemma. 



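Lemma 4.2 Consider the cognitive $(1, a, b, 1)$-interference channel. For every $\alpha \in [0, 1]$,
the rate pair $(R_p, R_c)$ satisfying

$$R_p = R_p(\alpha) := \frac{1}{2}\log\left(1 + \left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2\right), \tag{28}$$

$$R_c = R_c(\alpha) := \frac{1}{2}\log\left(1 + \frac{(1 - \alpha)P_c}{1 + \left(b\sqrt{P_p} + \sqrt{\alpha P_c}\right)^2}\right), \tag{29}$$

is achievable as long as

$$a \ge \frac{\sqrt{\alpha P_p P_c}}{K(\alpha)} + \frac{\sqrt{K(\alpha) + P_p\left(1 + \left(b\sqrt{P_p} + \sqrt{\alpha P_c}\right)^2\right)}}{K(\alpha)}, \tag{30}$$

where $K(\alpha) := 1 + b^2 P_p + 2b\sqrt{\alpha P_p P_c}$.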



Proof: The primary transmitter forms $X_p^n$ by drawing its coordinates i.i.d. according to
$\mathcal{N}(0, P_p)$. Since the cognitive radio knows $m_p$ and $e_p^n$, it forms $X_p^n$ and then generates $X_c^n$ by
superposition coding:

$$X_c^n = \hat{X}_c^n + \sqrt{\frac{\alpha P_c}{P_p}}\, X_p^n,$$

where $\hat{X}_c^n$ is formed by drawing its coordinates i.i.d. according to $\mathcal{N}(0, (1 - \alpha)P_c)$
for some $\alpha \in [0, 1]$. The decoder $d_p^n$ at the primary receiver first decodes $m_c$ treating
$\left(1 + a\sqrt{\alpha P_c / P_p}\right) X_p^n$ as independent Gaussian noise. It then reconstructs $a\hat{X}_c^n$ (which it
can do because it knows $e_c^n$) and subtracts off its contribution from $Y_p^n$ before decoding
$m_p$. The decoding rule $d_c^n$ at the secondary receiver is simply to decode $m_c$ treating
$\left(b + \sqrt{\alpha P_c / P_p}\right) X_p^n$ as independent Gaussian noise. The rates achievable with this scheme
are then exactly given by (28) and (29), provided that the rate at which the primary
receiver can decode the cognitive user's message is not the limiting factor, i.e.,

$$\frac{(1 - \alpha)P_c}{1 + \left(b\sqrt{P_p} + \sqrt{\alpha P_c}\right)^2} \le \frac{a^2(1 - \alpha)P_c}{1 + \left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2}.$$

Solving this quadratic inequality for $a$, we find that the condition is satisfied only when
$a$ satisfies inequality (30) stated in the lemma. □
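
The last step of the proof, turning the SINR comparison into a threshold on the gain $a$, can be cross-checked numerically. The following is a minimal sketch, assuming $K(\alpha) = 1 + b^2 P_p + 2b\sqrt{\alpha P_p P_c}$ with $b \ge 0$ and the threshold $\left(\sqrt{\alpha P_p P_c} + \sqrt{K(\alpha) + P_p(1 + (b\sqrt{P_p} + \sqrt{\alpha P_c})^2)}\right)/K(\alpha)$ obtained by solving the quadratic inequality for $a$. The function names and numerical values are hypothetical.

```python
import math


def K(alpha, b, P_p, P_c):
    """Auxiliary quantity K(alpha) = 1 + b^2 P_p + 2 b sqrt(alpha P_p P_c)."""
    return 1.0 + b ** 2 * P_p + 2.0 * b * math.sqrt(alpha * P_p * P_c)


def a_threshold(alpha, b, P_p, P_c):
    """Smallest gain a for which the primary receiver can decode m_c first."""
    k = K(alpha, b, P_p, P_c)
    inner = k + P_p * (1.0 + (b * math.sqrt(P_p) + math.sqrt(alpha * P_c)) ** 2)
    return (math.sqrt(alpha * P_p * P_c) + math.sqrt(inner)) / k


def primary_can_decode_first(alpha, a, b, P_p, P_c):
    """Direct check: SINR of the cognitive codeword at the secondary receiver
    must not exceed its SINR at the primary receiver."""
    sinr_secondary = (1.0 - alpha) * P_c / (1.0 + (b * math.sqrt(P_p) + math.sqrt(alpha * P_c)) ** 2)
    sinr_primary = a ** 2 * (1.0 - alpha) * P_c / (1.0 + (math.sqrt(P_p) + a * math.sqrt(alpha * P_c)) ** 2)
    return sinr_secondary <= sinr_primary


# Hypothetical example: alpha = 0.5, b = 0.4, P_p = 4, P_c = 1.
t = a_threshold(0.5, 0.4, 4.0, 1.0)
print(t, primary_can_decode_first(0.5, 1.01 * t, 0.4, 4.0, 1.0),
      primary_can_decode_first(0.5, 0.99 * t, 0.4, 4.0, 1.0))
```

Gains slightly above the threshold pass the direct SINR comparison and gains slightly below fail it, which is the equivalence the proof relies on.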



Theorem 4.3 A point $(R_p, R_c)$ is on the boundary of the capacity region of the cognitive
$(1, a, b, 1)$-interference channel if there exists $\alpha \in [0, 1]$ such that

1. $(R_p, R_c) = (R_p(\alpha), R_c(\alpha))$ where $R_p(\alpha)$ and $R_c(\alpha)$ are defined in (28) and (29),
respectively,

2. $a$ and $b$ satisfy the condition given in (30), and

3. $b \le b_{\max}(\mu_\alpha, a)$ where $\mu_\alpha = -\left.\frac{dR_c(x)}{dR_p(x)}\right|_{x = \alpha}$ and $b_{\max}(\mu, a)$ is defined in AppendixlU.

Proof of achievability: Given in Lemma 4.2.
Proof of converse: Given in Appendix O

Observe that Theorem 4.3 characterizes the entire capacity region of the $(1, a, b, 1)$-IC-DMS with

$$a \ge \frac{\sqrt{P_p P_c}}{K(1)} + \frac{\sqrt{K(1) + P_p\left(1 + \left(b\sqrt{P_p} + \sqrt{P_c}\right)^2\right)}}{K(1)}$$

and $b \le b_{\max}(\mu_\alpha, a)$.



5 System-level Considerations 



In this section we use our results on the capacity-achieving cognitive communication
scheme to derive insight into a practical implementation of cognitive radios.

5.1 Properties of the optimal scheme

5.1.1 Avoiding the "hidden-terminal" problem

The network of Fig. 1 models the situation in which the geographic location of $B_s$ is not
assigned in accordance with any centralized cell-planning policy and it can be arbitrarily
close to $B_p$. Consequently, the secondary users that are in close proximity to $B_p$ could
potentially cause significant interference for the primary system if the secondary system
is to operate over the same frequency band.

One possible adaptive communication scheme that the cognitive radio could employ 
in order to avoid interfering with the primary user in its vicinity would be to restrict its 
transmissions to only the time-frequency slots which are not occupied by the signals of the 
detected primary radio. Indeed, this idea of "opportunistic" orthogonal communication 
was what led to the birth of the notion of cognitive radio. However, one drawback of 
such a protocol is that the cognitive radio would very likely cause interference to other, 
more distant, primary users whose presence - i.e., time-frequency locations - it could 
not detect. The degradation in overall performance of the primary system due to this
"hidden-terminal" problem could potentially be significant^6, especially in the context of
OFDMA ^, jini where the primary users are allocated orthogonal time-frequency slots 
and the SINR required for decoding is typically large. 

Contrary to this, we find that the optimal strategy is for the cognitive radio to si- 
multaneously transmit in the same frequency slot as that used by the primary user in 
its vicinity. An immediate benefit of this scheme is that, if the transmissions of different 
primary users are mutually orthogonal, the cognitive radio can only (potentially) affect 
the performance achievable by the primary radio whose codeword it has decoded.
Furthermore, we know that a proper tuning of the parameter $\alpha$ can, in fact, ensure that the
primary user's rate is unaffected.



5.1.2 Robustness to noise statistics 

All our results have been derived under the assumption that the noise affecting the
receivers, $Z_p^n$ and $Z_s^n$, is i.i.d. Gaussian. In P^ it was shown that using a Costa
encoder-decoder pair that is designed for additive i.i.d. Gaussian noise on a channel with arbitrary
(additive) noise statistics will cause no loss in the achievable rates.^7 Combined with the
similar classical result for the standard AWGN channel [TT], we see that the maximal
rate expressed in Theorem 3.1 is achievable for all noise distributions.

^6 Classical RTS/CTS solutions to this problem are not viable since they require that the primary
system ask for access to the very spectrum that it owns.

^7 Note that this is an achievability result: The capacity of the channel with this arbitrary noise could
be larger but a different code would be required to achieve it.

5.2 Obtaining the side-information

In practice, the cognitive radio must obtain the primary radio's codeword in a causal 
fashion - its acquisition thus introducing delays in the cognitive radio transmissions^8. In
a typical situation, due to its relative proximity to the primary user, the cognitive radio
can receive the primary transmissions with a greater received SNR than that experienced
by the primary receiver. Hence, it seems plausible that the cognitive radio could decode^9
the message of the primary user in fewer channel uses than are required by the primary 
receiver. Recent work in distributed space-time code design ^3] indicates that this over- 
head decoding delay is negligible if the cognitive radio has as little as a 10 dB advantage 
in the received SNR over the primary receiver. 



5.3 Extension to complex baseband 

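The results of this paper can easily be extended to the case in which the channel gains
are complex quantities, i.e., $p, f, g, c \in \mathbb{C}$ in the case of the original cognitive $(p, f, g, c)$
channel with power constraints $(\tilde{P}_p, \tilde{P}_c)$ and noise variances $(N_p, N_s)$, as defined in
Section 2.1. However, the optimal cognitive encoder rule (13) must change slightly: The
superposition scheme takes the form

$$\tilde{X}_c^n = \hat{X}_c^n + \frac{f^*}{|f|}\, e^{j\theta_p} \sqrt{\frac{\alpha \tilde{P}_c}{\tilde{P}_p}}\, \tilde{X}_p^n, \tag{31}$$

where $p = |p| e^{j\theta_p}$. The codeword $\hat{X}_c^n$ is again generated by Costa precoding, but the
assumed interference at the secondary receiver is now

$$\left(\frac{g}{c} + \frac{f^*}{|f|}\, e^{j\theta_p} \sqrt{\frac{\alpha \tilde{P}_c}{\tilde{P}_p}}\right) \tilde{X}_p^n \tag{32}$$

and the assumed noise is $\mathcal{CN}(0, N_s/|c|^2)$. The phase factor in (31) essentially implements
transmit beamforming to the primary receiver, hence ensuring that all the rates given by

$$0 \le R_p \le \log\left(1 + \frac{\left(|p|\sqrt{\tilde{P}_p} + |f|\sqrt{\alpha \tilde{P}_c}\right)^2}{N_p + |f|^2(1 - \alpha)\tilde{P}_c}\right), \tag{33}$$

$$0 \le R_c \le \log\left(1 + \frac{|c|^2(1 - \alpha)\tilde{P}_c}{N_s}\right), \tag{34}$$

are achieved in the underlying IC-DMS. As before, we can then choose $\alpha = \alpha^*$
(determined by (17)), so that $R_c^* = \log(1 + |c|^2(1 - \alpha^*)\tilde{P}_c/N_s)$ is achievable in the spirit of
Definition 2.2, but with $R_p^* = \log(1 + |p|^2\tilde{P}_p/N_p)$.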
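
The beamforming effect of the phase factor can be illustrated with complex arithmetic. The following is a minimal sketch, assuming the superposition coefficient is a unit-modulus factor that rotates $f$ onto the phase of $p$, written here as $(f^*/|f|)\,e^{j\theta_p}\sqrt{\alpha \tilde{P}_c/\tilde{P}_p}$; this particular way of writing the phase alignment, the function names, and the numerical values are assumptions made for illustration.

```python
import cmath
import math


def superposition_coefficient(p, f, alpha, P_p, P_c):
    """Complex weight applied to the primary codeword by the cognitive encoder.

    The unit-modulus factor conj(f)/|f| * exp(j*theta_p) aligns the relayed copy
    with the direct primary signal at the primary receiver.
    """
    theta_p = cmath.phase(p)
    return (f.conjugate() / abs(f)) * cmath.exp(1j * theta_p) * math.sqrt(alpha * P_c / P_p)


def effective_primary_gain(p, f, alpha, P_p, P_c):
    """Total complex gain seen by the primary codeword at the primary receiver."""
    return p + f * superposition_coefficient(p, f, alpha, P_p, P_c)


# Hypothetical complex gains and powers.
p = 1.2 * cmath.exp(0.7j)
f = 0.6 * cmath.exp(-1.1j)
gain = effective_primary_gain(p, f, alpha=0.3, P_p=2.0, P_c=1.5)
print(abs(gain) * math.sqrt(2.0), abs(p) * math.sqrt(2.0) + abs(f) * math.sqrt(0.3 * 1.5))
```

The two printed values agree up to rounding: the direct and relayed copies add coherently, which is what produces the amplitude $|p|\sqrt{\tilde{P}_p} + |f|\sqrt{\alpha\tilde{P}_c}$ in the numerator of the primary user's rate.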

^8 Under a half-duplex constraint the cognitive radio must first "listen" in order to decode the primary
message before it can use this side-information for its own transmission.

^9 The cognitive radio is assumed to know the encoder of the primary user.

5.4 Communicating without channel-state feedback from the primary base-station

In order to perform the complex base-band superposition coding scheme (31) and,
implicitly, the Costa precoding for known interference (32), the cognitive radio must know each
of the four parameters $g$, $c$, $f$ and $p$, both in magnitude and phase. To obtain estimates
for $p$ and $f$, the cognitive radio would require feedback from the primary base-station.
In Section 5.5 we discuss ways in which the estimation and feedback of these
parameters could be implemented. In this section, however, we present an alternative
(suboptimal) scheme which requires no feedback from the primary base-station.

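Suppose that, after having decoded $\tilde{X}_p^n$, the cognitive radio transmits the following
$n$-symbol codeword:

$$\tilde{X}_c^n = \hat{X}_c^n + \sqrt{\frac{\alpha \tilde{P}_c}{\tilde{P}_p}}\, \tilde{X}_p^n, \tag{35}$$

where the codeword $\hat{X}_c^n$ is generated by Costa precoding for the interference

$$\left(\frac{g}{c} + \sqrt{\frac{\alpha \tilde{P}_c}{\tilde{P}_p}}\right) \tilde{X}_p^n \tag{36}$$

assuming the presence of $\mathcal{CN}(0, N_s/|c|^2)$ noise at the secondary base-station.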

• Obtaining c: The parameter c could be estimated at the secondary base-station 
by using the cognitive radio's pilot signal or in a decision-directed fashion. The 
estimate could then be fed back to the cognitive radio. 

• Obtaining g: If the secondary base-station synchronizes to the primary radio's pilot 
signal, it could estimate g during the time the cognitive radio is in its silent "listen- 
ing" phase and then feed this estimate back to the cognitive radio. Alternatively, 
if the cognitive radio reveals to the secondary base-station the code used by the 
primary radio, the secondary base-station could use the silent "listening" phase 
to decode a few symbols transmitted by the primary radio thereby estimating the 
parameter g. 

We can express the received discrete-time base-band signal at the primary base-station
at time sample $m$ as

$$\tilde{Y}_p[m] = p\tilde{X}_p[m] + f\sqrt{\frac{\alpha \tilde{P}_c}{\tilde{P}_p}}\, \tilde{X}_p[m - l_c] + Z_{\mathrm{total}}[m], \tag{37}$$

where $Z_{\mathrm{total}}[m] = f\hat{X}_c[m - l_c] + \tilde{Z}_p[m]$ is the aggregate noise. The integer $l_c$ accounts for
the delay incurred while the cognitive radio "listens" and decodes the primary codeword
before it transmits its own signal. This equation essentially describes a time-invariant
two-tap ISI channel for the primary transmission, hence we can apply a Rake receiver (in
the case the primary system uses direct-sequence spread-spectrum) or transmit-receive
architectures such as OFDM^10 to extract both a diversity gain of two and a power gain
of $|p|^2\tilde{P}_p + |f|^2\alpha\tilde{P}_c$ at the primary base-station (see, for instance, Chapter 3 of ^^, and
references therein). Given a e [0, 1], the rates achievable by the primary and cognitive 
users using such a scheme are given by 



0<i?, < log 1+ J V"'^':;; , (38 



/ |cP(l-a)PA , ^ 

0<Rc < log(l+ ' ' ^^ ^ M . (39) 

In order to avoid causing interference to the primary user, the following equation must 
be satisfied: 

\p\'P, + \f\'aP^ ^ IpIP, 
N,+ \mi-a)P, N, ' ^ ' 

If the cognitive radio tunes its parameter $\alpha$ such that

$$\alpha = \frac{|p|^2 P_p / N_p}{1 + |p|^2 P_p / N_p}, \qquad (41)$$

this condition will be satisfied, hence $R_p = R^*$. Expression (41) confirms the intuitive
notion that, if the primary system is operating at high SNR, the cognitive radio should
not interfere with it, i.e., $\alpha$ should be close to one.

From (41), we see that, in order to design the optimal $\alpha$, the cognitive radio only
needs to know the received SNR of the primary transmission at the primary base-station:
$|p|^2 P_p/N_p$. If the primary system uses a good (capacity-achieving) AWGN channel code
and the cognitive radio knows this, the cognitive radio can easily compute an estimate of
this received SNR since it knows the rate at which the primary user is communicating,
$R_p$. This estimate is simply given by $e^{R_p} - 1$. Thus, an immediate benefit of this scheme
is that the primary base-station need not feed back the parameters $f$ and $p$ at all: the
cognitive radio can perform completely autonomously.
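As a rough illustration of this computation, the following Python sketch solves condition (40) for $\alpha$ and infers the SNR from the primary rate. The function names and the numerical values are hypothetical and chosen only for illustration; rates are assumed to be in nats, and the primary code is assumed to be capacity-achieving.

```python
import math


def alpha_from_snr(snr: float) -> float:
    """Power split solving (40) for alpha, with snr = |p|^2 P_p / N_p (linear scale)."""
    return snr / (1.0 + snr)


def alpha_from_rate(rate_nats: float) -> float:
    """Same split, with the SNR inferred from the primary rate R_p (in nats),
    assuming a capacity-achieving AWGN code: SNR = e^{R_p} - 1."""
    return alpha_from_snr(math.exp(rate_nats) - 1.0)


def primary_sinr(p_sq, f_sq, P_p, P_c, N_p, alpha):
    """Left-hand side of (40): effective SINR seen by the primary base-station."""
    return (p_sq * P_p + f_sq * alpha * P_c) / (N_p + f_sq * (1.0 - alpha) * P_c)


# Illustrative (made-up) numbers: primary SNR of 9, arbitrary cross gain |f|^2.
alpha = alpha_from_snr(9.0)
sinr = primary_sinr(p_sq=1.0, f_sq=0.5, P_p=9.0, P_c=4.0, N_p=1.0, alpha=alpha)
```

With these numbers the effective SINR stays at the interference-free value of 9 whatever value is chosen for `f_sq`, which is the sense in which the cognitive radio does not need to know $f$.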

Though expression (41) does not depend on $|f|$, we can see that (40) can approximately
be satisfied even with $\alpha = 0$ when $|f|^2$ is very small. Since the cognitive radio
has no information about $|f|$ and, in practice, may not even be able to obtain $|p|^2 P_p/N_p$
(if the primary system is not using a good AWGN code), a natural way for the cognitive
radio to enter the spectrum of the primary would be by slowly ramping up its power
$P_c$ from 0 and decreasing $\alpha$ from 1 while simultaneously listening for the Automatic
Repeat Request (ARQ) control signal from the primary base-station. Once this signal
is detected, the cognitive radio would either slightly decrease $P_c$ or increase $\alpha$ until the
primary base-station stops transmitting ARQs$^{11}$.

$^{10}$The primary base-station would most likely already employ one of these schemes as a means of
dealing with the multi-path point-to-point channel between the primary radio and itself. In the context
of OFDM, however, the cyclic prefix would have to be long enough to account for the extra delay-spread
introduced by the cognitive radio's transmission.
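The ramp-up procedure can be sketched in Python as follows. This is only a toy model under stated assumptions: an ARQ is assumed to be issued exactly when the effective SINR at the primary base-station falls below its interference-free SNR, $P_c$ and $\alpha$ are stepped in lock-step on a fixed grid, and all names (`arq_triggered`, `ramp_up`) and numbers are hypothetical.

```python
def arq_triggered(p_sq, f_sq, P_p, N_p, P_c, alpha, tol=1e-9):
    """Toy ARQ model (an assumption): the primary decoder fails, and an ARQ is
    sent, whenever its effective SINR drops below the interference-free SNR."""
    target = p_sq * P_p / N_p
    sinr = (p_sq * P_p + f_sq * alpha * P_c) / (N_p + f_sq * (1.0 - alpha) * P_c)
    return sinr < target * (1.0 - tol)


def ramp_up(p_sq, f_sq, P_p, N_p, P_c_max, n_steps=100):
    """Raise P_c from 0 and lower alpha from 1 in lock-step; stop at the last
    setting that did not trigger an ARQ and return (P_c, alpha)."""
    P_c, alpha = 0.0, 1.0
    for k in range(1, n_steps + 1):
        cand_P_c = P_c_max * k / n_steps
        cand_alpha = 1.0 - k / n_steps
        if arq_triggered(p_sq, f_sq, P_p, N_p, cand_P_c, cand_alpha):
            break  # back off to the previous (P_c, alpha)
        P_c, alpha = cand_P_c, cand_alpha
    return P_c, alpha


# Made-up example: primary SNR of 5, so the threshold from (41) is alpha = 5/6.
P_c_final, alpha_final = ramp_up(p_sq=1.0, f_sq=0.5, P_p=5.0, N_p=1.0, P_c_max=10.0)
```

In this toy model the loop settles on the smallest grid value of $\alpha$ that is still above the threshold in (41), without the cognitive radio ever using $f$ or $p$ directly; only the ARQ feedback is observed.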



5.5 Obtaining the channel-state information 

In order to implement the optimal communication scheme of Costa coding and beamforming
(??), the cognitive radio must obtain estimates of $p$ and $f$ from the primary
base-station. We present the following simple algorithm for estimation and feedback of
these parameters:

1. At first, the cognitive user is silent and the primary base-station broadcasts the
current estimate of $p$, call it $\hat{p}$, along with the primary user's ID on the control
channel to which the cognitive radio is tuned. The primary base-station is assumed
to be able to track $p$ either by using a pilot signal or in a decision-directed fashion.
Thus, the cognitive radio can obtain $\hat{p}$.

2. Upon entering the system and decoding the message of the primary user in its
vicinity, the cognitive radio simply performs amplify-and-forward relaying of the
primary codeword:

$$X_c^n = \sqrt{\frac{P_c}{P_p}}\, X_p^n. \qquad (42)$$

3. The primary base-station receives

$$Y_p^n = \left(p + f\sqrt{\frac{P_c}{P_p}}\right) X_p^n + Z_p^n, \qquad (43)$$

hence it can compute an estimate, $\hat{h}$, of the overall channel gain $\left(p + f\sqrt{P_c/P_p}\right)$ as it
decodes $m_p$.

4. The quantized version of $\hat{h}$ is then broadcast on the control channel along with the
given primary user's ID.

5. The cognitive radio picks up this information from the control channel and then
computes $\hat{h} - \hat{p}$.

6. The quantity $\hat{h} - \hat{p}$ is an estimate for $f\sqrt{P_c/P_p}$, which is then multiplied by $\sqrt{P_p/P_c}$
to obtain an estimate for $f$.
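The arithmetic of steps 5 and 6 can be illustrated with a short Python sketch. The function name `estimate_f` and the channel values are hypothetical, and estimation noise and quantization of $\hat{h}$ are ignored, so the example only checks that the scaling is undone correctly.

```python
import math


def estimate_f(h_hat: complex, p_hat: complex, P_p: float, P_c: float) -> complex:
    """Steps 5-6: subtract p_hat from the broadcast overall gain h_hat, then
    undo the amplify-and-forward scaling sqrt(P_c / P_p)."""
    return (h_hat - p_hat) * math.sqrt(P_p / P_c)


# Noise-free illustration with made-up complex channel gains.
p, f = 1.0 + 0.5j, 0.3 - 0.2j
P_p, P_c = 4.0, 1.0
h = p + f * math.sqrt(P_c / P_p)  # overall gain observed in step 3
f_hat = estimate_f(h, p, P_p, P_c)
```

In this idealized setting `f_hat` coincides with `f`; in practice its accuracy would be limited by the quality of $\hat{p}$ and $\hat{h}$.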



$^{11}$This scheme is analogous to the power control mechanism used in CDMA systems.

Note that it is possible that $\left|p + f\sqrt{P_c/P_p}\right| < |p|$ in step 3 above. In this case the primary
system would momentarily not be able to support the requested rate of $\log(1 + |p|^2 P_p/N_p)$
and an Automatic Repeat Request (ARQ) would be generated by the primary base-station.
However, by this time, the cognitive radio would have already obtained the
estimate of $f$ and the next (repeated) transmission would be guaranteed to be successful.



A Proof of the converse part of Theorem 4.1



First we observe that the rate region specified in Theorem 4.1 is a convex set (see
Proposition D.1). We will use the following standard result from convex analysis (see, for instance,
[15]) in the proof of the converse.



Proposition A.1 A point $R^* = (R_p^*, R_c^*)$ is on the boundary of a capacity region if
and only if there exists a $\mu \ge 0$ such that the linear functional $\mu R_p + R_c$ achieves its
maximum, over all $(R_p, R_c)$ in the region, at $R^*$.



A.1 The $\mu \le 1$ case

For convenience, we will consider a channel whose output at the primary receiver is
normalized by $a$, i.e., a channel whose input-output single-letter equations are given by

$$Y_p^n = \frac{1}{a} X_p^n + X_c^n + \frac{1}{a} Z_p^n, \qquad (44)$$

$$Y_s^n = b X_p^n + X_c^n + Z_s^n. \qquad (45)$$

Note that the capacity region of this channel is the same as that of the original channel
(??), (??) since normalization is an invertible transformation.

Suppose that a rate pair $(R_p, R_c)$ is achievable, in the sense of Definition 4.2, for
the $(1, a, b, 1)$-IC-DMS. Assuming that the messages $(m_p, m_c)$ are chosen uniformly and
independently, we have, by Fano's inequality, $H(m_p|Y_p^n) \le n\epsilon_{p,n}$ and $H(m_c|Y_s^n) \le n\epsilon_{s,n}$,
where $\epsilon_{p,n} \to 0$ and $\epsilon_{s,n} \to 0$ as $P_{e,p} \to 0$, $P_{e,s} \to 0$, respectively. We start with the
following bound on $nR_p$:


$$\begin{aligned}
nR_p &\overset{(a)}{=} H(m_p), \\
&= I(m_p; Y_p^n) + H(m_p|Y_p^n), \\
&\overset{(b)}{\le} I(m_p; Y_p^n) + n\epsilon_{p,n}, \\
&= h(Y_p^n) - h(Y_p^n|m_p) + n\epsilon_{p,n},
\end{aligned} \qquad (46)$$

where (a) follows since $m_p$ and $m_c$ are uniformly distributed on $\{1, 2, \ldots, 2^{nR_p}\}$ and
$\{1, 2, \ldots, 2^{nR_c}\}$ respectively, and (b) follows from Fano's inequality. Also, we have that

$$\begin{aligned}
nR_c &= H(m_c), \\
&= H(m_c) + H(m_c|Y_s^n, m_p) - H(m_c|Y_s^n, m_p), \\
&= I(m_c; Y_s^n|m_p) + H(m_c|Y_s^n, m_p), \\
&\overset{(a)}{\le} I(m_c; Y_s^n|m_p) + n\epsilon_{s,n}, \\
&= h(Y_s^n|m_p) - h(Y_s^n|m_p, m_c) + n\epsilon_{s,n}, \\
&\overset{(b)}{\le} h(Y_s^n|m_p) - h(Y_s^n|m_p, m_c, X_p^n, X_c^n) + n\epsilon_{s,n}, \\
&\overset{(c)}{=} h(Y_s^n|m_p) - h(Z_s^n) + n\epsilon_{s,n},
\end{aligned} \qquad (47)$$

where (a) follows from Fano's inequality and the fact that conditioning does not increase
entropy, (b) follows from the fact that conditioning does not increase entropy, and (c)
follows from the fact that $Z_s^n$ is independent of $(m_p, m_c)$ and hence also of $(X_p^n, X_c^n)$.

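Let $\tilde{Z}^n$ be a zero mean Gaussian random vector, independent of $(X_p^n, X_c^n, Z_p^n, Z_s^n)$ and
with covariance matrix $\left(\frac{1}{a^2} - 1\right) I_n$. Then, we can write

$$\begin{aligned}
h(Y_p^n|m_p) &\overset{(a)}{=} h(Y_p^n|m_p, X_p^n), \\
&\overset{(b)}{=} h\left(X_c^n + \tfrac{1}{a} Z_p^n \,\middle|\, m_p, X_p^n\right), \\
&\overset{(c)}{=} h(X_c^n + Z_s^n + \tilde{Z}^n \,|\, m_p, X_p^n), \\
&\overset{(d)}{=} h(X_c^n + Z_s^n + \tilde{Z}^n \,|\, m_p), \\
&\overset{(e)}{=} h(\tilde{Y}^n + \tilde{Z}^n \,|\, m_p),
\end{aligned} \qquad (48)$$

where (a) and (d) hold since $X_p^n$ is the output of a deterministic function of $m_p$, (b)
holds because translation does not affect entropy, (c) follows from the fact that Gaussian
distributions are infinitely divisible and from the definition of $\tilde{Z}^n$, and (e) follows from
the definition $\tilde{Y}^n = X_c^n + Z_s^n$. By similar reasoning, we can write

$$h(Y_s^n|m_p) = h(\tilde{Y}^n|m_p). \qquad (49)$$

Combining the bounds in (46) and (47), we get

$$\begin{aligned}
n(\mu R_p + R_c) &\le \mu h(Y_p^n) + h(Y_s^n|m_p) - \mu h(Y_p^n|m_p) - h(Z_s^n) + \mu n\epsilon_{p,n} + n\epsilon_{s,n}, \\
&\overset{(a)}{=} \mu h(Y_p^n) + h(Y_s^n|m_p) - \mu h(Y_p^n|m_p) - \frac{n}{2}\log(2\pi e) + \mu n\epsilon_{p,n} + n\epsilon_{s,n}, \\
&\overset{(b)}{=} \mu h(Y_p^n) + h(\tilde{Y}^n|m_p) - \mu h(\tilde{Y}^n + \tilde{Z}^n|m_p) - \frac{n}{2}\log(2\pi e) + \mu n\epsilon_{p,n} + n\epsilon_{s,n}, \\
&\overset{(c)}{\le} \mu h(Y_p^n) + h(\tilde{Y}^n|m_p) - \frac{\mu n}{2}\log\left(e^{\frac{2}{n}h(\tilde{Y}^n|m_p)} + e^{\frac{2}{n}h(\tilde{Z}^n)}\right) \\
&\qquad - \frac{n}{2}\log(2\pi e) + \mu n\epsilon_{p,n} + n\epsilon_{s,n},
\end{aligned} \qquad (50)$$

where (a) follows from the fact that $Z_s^n \sim \mathcal{N}(0, I_n)$, (b) follows from equalities (48)
and (49), and (c) follows from the conditional version of the Entropy Power Inequality (see
Proposition D.2).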



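Let $X^{j-1}$ denote the first $j - 1$ components of the vector $X^n$ with the understanding
that $X^0$ is defined to be some constant and let $X_j$ denote the $j$-th component. We can
upper-bound $h(\tilde{Y}^n|m_p)$ as follows:

$$\begin{aligned}
h(\tilde{Y}^n|m_p) &= h(\tilde{Y}^n|m_p, X_p^n), \\
&\overset{(a)}{=} \sum_{j=1}^{n} h(\tilde{Y}_j \,|\, m_p, \tilde{Y}^{j-1}, X_p^n), \\
&\overset{(b)}{\le} \sum_{j=1}^{n} h(\tilde{Y}_j \,|\, X_{p,j}), \\
&\overset{(c)}{\le} \sum_{j=1}^{n} \frac{1}{2}\log\left(2\pi e\left(\mathbb{E}[\tilde{Y}_j^2] - \frac{\mathbb{E}[\tilde{Y}_j X_{p,j}]^2}{\mathbb{E}[X_{p,j}^2]}\right)\right), \\
&\overset{(d)}{=} \sum_{j=1}^{n} \frac{1}{2}\log\left(2\pi e\left((1-\alpha_j)P_{c,j} + 1\right)\right), \qquad (51) \\
&\overset{(e)}{\le} \frac{n}{2}\log\left(2\pi e\left((1-\alpha)P_c + 1\right)\right), \qquad (52)
\end{aligned}$$

where (a) follows from the chain rule, (b) follows from the fact that conditioning
does not increase entropy, and (c) follows from Lemma D.1. Equality (d) follows from
the following argument: Since jointly Gaussian $X_{p,j}$, $\tilde{Y}_j$ achieve equality in (c) (by
Lemma D.1), we can, without loss of generality, let

$$X_{c,j} = \hat{X}_{c,j} + \sqrt{\frac{\alpha_j P_{c,j}}{P_{p,j}}}\, X_{p,j}, \qquad (53)$$

where $\hat{X}_{c,j} \sim \mathcal{N}(0, (1-\alpha_j)P_{c,j})$ is independent of $X_{p,j}$ and

$$P_{c,j} \triangleq \mathbb{E}[X_{c,j}^2], \qquad P_{p,j} \triangleq \mathbb{E}[X_{p,j}^2], \qquad (54)$$

with the expectations taken over the uniformly distributed messages, i.e., averaged over the codebooks.
The parameter $\alpha_j \in [0, 1]$ is chosen so that the resulting covariance $K_{X_{p,j}, X_{c,j}, Y_{p,j}, Y_{s,j}}$ is
the same as that induced by the code. The inequality labeled (e) follows from Jensen's
inequality, by choosing $\alpha \in [0, 1]$ such that

$$\alpha P_c = \frac{1}{n}\sum_{j=1}^{n} \alpha_j P_{c,j}, \qquad (55)$$

and from the fact that the power constraint $\|X_c^n\|^2 \le nP_c$ implies that $\frac{1}{n}\sum_{j=1}^{n} P_{c,j} \le P_c$.
Similarly, we can upper bound $h(Y_p^n)$ as follows: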

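$$\begin{aligned}
h(Y_p^n) &\overset{(a)}{=} \sum_{j=1}^{n} h(Y_{p,j} \,|\, Y_p^{j-1}), \\
&\overset{(b)}{\le} \sum_{j=1}^{n} h(Y_{p,j}), \\
&\overset{(c)}{\le} \sum_{j=1}^{n} \frac{1}{2}\log\left(2\pi e\, \mathbb{E}[Y_{p,j}^2]\right), \\
&\overset{(d)}{=} \sum_{j=1}^{n} \frac{1}{2}\log\left(\frac{2\pi e}{a^2}\left(P_{p,j} + 2a\sqrt{\alpha_j P_{p,j} P_{c,j}} + a^2 P_{c,j} + 1\right)\right), \\
&\overset{(e)}{\le} \frac{n}{2}\log\left(\frac{2\pi e}{a^2}\left(\left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2 + a^2(1-\alpha)P_c + 1\right)\right),
\end{aligned} \qquad (56)$$

where (a) follows from the chain rule, (b) follows from the fact that conditioning does
not increase entropy, (c) holds since the Gaussian distribution maximizes the differential
entropy for a fixed variance, (d) follows from the same argument as in (51), and (e) comes
from Jensen's inequality applied to the $\log(\cdot)$ and the $\sqrt{\cdot}$ functions.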

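Let $f(x) = x - \frac{\mu n}{2}\log\left(e^{\frac{2}{n}x} + e^{\frac{2}{n}h(\tilde{Z}^n)}\right)$ over $x \in \mathbb{R}$. Then, we can express the bound
on our linear functional in (50) as

$$n(\mu R_p + R_c) \le \mu h(Y_p^n) + f\left(h(\tilde{Y}^n|m_p)\right) - \frac{n}{2}\log(2\pi e) + \mu n\epsilon_{p,n} + n\epsilon_{s,n}. \qquad (57)$$

Observe that as long as $\mu \le 1$, $f(x)$ is increasing. Hence we can obtain a further upper
bound by substituting inequalities (52) and (56) into (57):

$$\begin{aligned}
n(\mu R_p + R_c) &\le \frac{\mu n}{2}\log\left(\frac{2\pi e}{a^2}\left(\left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2 + a^2(1-\alpha)P_c + 1\right)\right) \qquad (58) \\
&\quad + f\left(\frac{n}{2}\log\left(2\pi e\left((1-\alpha)P_c + 1\right)\right)\right) - \frac{n}{2}\log(2\pi e) + \mu n\epsilon_{p,n} + n\epsilon_{s,n} \qquad (59) \\
&\overset{(d)}{=} \frac{\mu n}{2}\log\left(\frac{2\pi e}{a^2}\left(\left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2 + a^2(1-\alpha)P_c + 1\right)\right) \qquad (60) \\
&\quad + \frac{n}{2}\log\left(2\pi e\left((1-\alpha)P_c + 1\right)\right) - \frac{\mu n}{2}\log\left(2\pi e\left((1-\alpha)P_c + \frac{1}{a^2}\right)\right) \qquad (61) \\
&\quad - \frac{n}{2}\log(2\pi e) + \mu n\epsilon_{p,n} + n\epsilon_{s,n}, \qquad (62)
\end{aligned}$$

where (d) follows from the fact that

$$\begin{aligned}
f(x) &= x - \frac{\mu n}{2}\log\left(e^{\frac{2}{n}x} + e^{\frac{2}{n}h(\tilde{Z}^n)}\right), \qquad (63) \\
&= x - \frac{\mu n}{2}\log\left(e^{\frac{2}{n}x} + 2\pi e\left(\frac{1}{a^2} - 1\right)\right), \qquad (64)
\end{aligned}$$

which holds since $\tilde{Z}^n$ is zero mean Gaussian with covariance $\left(\frac{1}{a^2} - 1\right) I_n$.

Grouping together the $\mu$-terms, dividing by $n$ and letting $n \to \infty$, we get that

$$\mu R_p + R_c \le \max_{0 \le \alpha \le 1} \left\{ \frac{\mu}{2}\log\left(1 + \frac{\left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2}{1 + a^2(1-\alpha)P_c}\right) + \frac{1}{2}\log\left(1 + (1-\alpha)P_c\right) \right\}. \qquad (65)$$

Let $\alpha_\mu$ denote the maximizing $\alpha \in [0, 1]$ for a given $\mu \le 1$ in the above expression. Then,
we can write

$$\mu R_p + R_c \le \frac{\mu}{2}\log\left(1 + \frac{\left(\sqrt{P_p} + a\sqrt{\alpha_\mu P_c}\right)^2}{1 + a^2(1-\alpha_\mu)P_c}\right) + \frac{1}{2}\log\left(1 + (1-\alpha_\mu)P_c\right). \qquad (66)$$

Hence we have established the converse of the theorem for $\mu \le 1$.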
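To make the role of $\alpha_\mu$ concrete, the following Python sketch evaluates the rate pair appearing in (66) and locates the maximizing $\alpha$ by a plain grid search. The grid search, the function names and the parameter values are illustrative assumptions, not part of the proof; rates are in nats.

```python
import math


def rates(alpha, a, P_p, P_c):
    """Rate pair (R_p, R_c) appearing in (66) for a given power split alpha."""
    R_p = 0.5 * math.log(
        1.0 + (math.sqrt(P_p) + a * math.sqrt(alpha * P_c)) ** 2
        / (1.0 + a * a * (1.0 - alpha) * P_c)
    )
    R_c = 0.5 * math.log(1.0 + (1.0 - alpha) * P_c)
    return R_p, R_c


def boundary_point(mu, a, P_p, P_c, grid=1001):
    """Grid search for alpha_mu, the maximizer of mu * R_p + R_c over [0, 1]."""
    best = None
    for i in range(grid):
        alpha = i / (grid - 1)
        R_p, R_c = rates(alpha, a, P_p, P_c)
        value = mu * R_p + R_c
        if best is None or value > best[0]:
            best = (value, alpha, R_p, R_c)
    return best[1], best[2], best[3]


# Sweeping mu over [0, 1] traces out boundary points of the region (made-up parameters).
points = [boundary_point(mu / 10.0, a=0.5, P_p=1.0, P_c=1.0) for mu in range(11)]
```

For $\mu = 0$ the search returns $\alpha = 0$, i.e., the cognitive user spends all of its power on its own message, as expected from the form of the functional.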



A.2 The $\mu > 1$ case

A.2.1 Proof outline

Suppose that "genie A" gives the message $m_p$ to the cognitive receiver. We will refer to
this channel as the IC-DMS(A). The capacity region of the IC-DMS(A) must contain the
capacity region of the original IC-DMS.

Proposition A.2 The capacity region of the $(1, a, 0, 1)$-IC-DMS(A) is identical to the
capacity region of the $(1, a, b, 1)$-IC-DMS(A) for every $b \in \mathbb{R}$ and every $a \in \mathbb{R}$.

Proof: Since $m_p$ is known at the secondary receiver along with the primary encoding rule
$C_p$, the secondary receiver of the $(1, a, 0, 1)$-IC-DMS(A) can form $bX_p^n$ and add it to its
received signal $Y_s^n$. The result is statistically identical to the output at the secondary
receiver of the $(1, a, b, 1)$-IC-DMS(A). Thus the capacity region is independent of $b$. □

This proposition allows us to set $b = 0$ without loss of generality in any IC-DMS(A).

Now suppose that "genie B" gives $m_c$ to the primary transmitter of the $(1, a, 0, 1)$-IC-DMS(A).
We will refer to this channel as the $(1, a, 0, 1)$-IC-DMS(A,B) and we note
that its capacity region must contain the capacity region of the original $(1, a, b, 1)$-IC-DMS
as well as that of the IC-DMS(A). Observe that this channel is equivalent to a
broadcast channel with two antennas at the transmitter and one antenna at each of
the receivers ($2 \times 1$ MIMO BC channel) with per-antenna power constraints but with
additional knowledge of $m_p$ at the secondary receiver.



Figure 4: The $(1, a, b, 1)$-IC-DMS, the $(1, a, 0, 1)$-IC-DMS(A) and the $(1, a, 0, 1)$-IC-DMS(A,B)
channels and the relationships between their capacity regions.



Thus, if we can show that the rates achieved by our proposed scheme for the $(1, a, b, 1)$-IC-DMS
(given by (24) and (25)) are optimal for the $(1, a, 0, 1)$-IC-DMS(A,B), then we
are done. To this end, we will first define a sequence of channels, each of which has a
capacity region that includes the capacity region of the $(1, a, 0, 1)$-IC-DMS(A,B), such
that these rates are optimal in the limit.



A.2.2 The aligned (1, a, 0, 1)-IC-DMS(A,B): The achievability 

Consider the following modification of the $(1, a, 0, 1)$-IC-DMS(A,B): Add one antenna at
each of the receivers so that the input-output relationship becomes

$$\mathbf{Y}_p = \begin{bmatrix} 1 & a \\ 1 & 0 \end{bmatrix} \mathbf{X} + \mathbf{Z}_p, \qquad (67)$$

$$\mathbf{Y}_s = \begin{bmatrix} 0 & 1 \\ \epsilon & 1 \end{bmatrix} \mathbf{X} + \mathbf{Z}_s, \qquad (68)$$

where $\epsilon > 0$ and $a \ne 0$. The vectors $\mathbf{Z}_p$ and $\mathbf{Z}_s$ are distributed according to $\mathcal{N}(0, \Sigma_z)$
(their cross-correlation is irrelevant), where

$$\Sigma_z = \begin{bmatrix} 1 & 0 \\ 0 & M \end{bmatrix} \qquad (69)$$

for some $M > 0$. As in the original $(1, a, 0, 1)$-IC-DMS(A,B), the message $m_p$ is known at
the secondary receiver. Clearly, the capacity region of this channel contains the capacity
region of the $(1, a, 0, 1)$-IC-DMS(A,B). We shall refer to this genie-aided MIMO BC
channel as the aligned $(1, a, 0, 1)$-IC-DMS(A,B) in what follows.

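Let $H_p$ and $H_s$ denote the matrices pre-multiplying the transmit vector $\mathbf{X}$ in (67)
and (68), respectively. Each coordinate of the vector $\mathbf{X} \in \mathbb{R}^2$ represents the symbol on
each of the antennas and the constraint on $\mathbf{X}$ can in general take the form $\mathbb{E}[\mathbf{X}\mathbf{X}^T] \preceq Q$
for some positive semi-definite covariance constraint $Q \succeq 0$. Let the transmitted vector
(at any time-sample) be of the form

$$\mathbf{X} = X_{p1}\mathbf{u}_{p1} + X_{p2}\mathbf{u}_{p2} + X_{c1}\mathbf{u}_{c1} + X_{c2}\mathbf{u}_{c2}, \qquad (70)$$

where $\mathbf{u}_{p1}, \mathbf{u}_{p2} \in \mathbb{R}^2$ and $\mathbf{u}_{c1}, \mathbf{u}_{c2} \in \mathbb{R}^2$ are the so-called signature vectors and the symbols
$X_{p1}, X_{p2}$ and $X_{c1}, X_{c2}$ are i.i.d. $\mathcal{N}(0, 1)$.

In order to emulate the per-user individual power constraints of the IC-DMS, we
impose the per-antenna constraints $(\mathbb{E}[\mathbf{X}\mathbf{X}^T])_{11} \le P_p$ and $(\mathbb{E}[\mathbf{X}\mathbf{X}^T])_{22} \le P_c$ on the
achievable strategies in the MIMO BC channel. We let

$$\Sigma_p = \mathbf{u}_{p1}\mathbf{u}_{p1}^T + \mathbf{u}_{p2}\mathbf{u}_{p2}^T, \qquad (71)$$

$$\Sigma_c = \mathbf{u}_{c1}\mathbf{u}_{c1}^T + \mathbf{u}_{c2}\mathbf{u}_{c2}^T, \qquad (72)$$

so that, by the independence of $X_{p1}$, $X_{p2}$, $X_{c1}$ and $X_{c2}$, the constraint can be expressed
as $(\Sigma_p + \Sigma_c)_{11} \le P_p$ and $(\Sigma_p + \Sigma_c)_{22} \le P_c$.

Substituting the expression for $\mathbf{X}$ given in (70), the channel equations become

$$\mathbf{Y}_p = H_p(X_{p1}\mathbf{u}_{p1} + X_{p2}\mathbf{u}_{p2}) + H_p(X_{c1}\mathbf{u}_{c1} + X_{c2}\mathbf{u}_{c2}) + \mathbf{Z}_p, \qquad (73)$$

$$\mathbf{Y}_s = H_s(X_{p1}\mathbf{u}_{p1} + X_{p2}\mathbf{u}_{p2}) + H_s(X_{c1}\mathbf{u}_{c1} + X_{c2}\mathbf{u}_{c2}) + \mathbf{Z}_s. \qquad (74)$$

Consider the following encoding scheme: first choose $X_{p1}$ and $X_{p2}$ to be independent
and distributed according to $\mathcal{N}(0, 1)$, and then perform Costa precoding to encode the
information in $(X_{c1}, X_{c2})$ treating the interference $H_s(X_{p1}\mathbf{u}_{p1} + X_{p2}\mathbf{u}_{p2})$ as side-information
known at the transmitter$^{12}$. The rates achievable with such a scheme are: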



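$$R_p = R_p(\Sigma_p^*, \Sigma_c^*) \triangleq \frac{1}{2}\log\left|I + \left(I + \Sigma_z^{-1}H_p\Sigma_c^*H_p^T\right)^{-1}\Sigma_z^{-1}H_p\Sigma_p^*H_p^T\right|, \qquad (75)$$

$$R_c = R_c(\Sigma_p^*, \Sigma_c^*) \triangleq \frac{1}{2}\log\left|I + \Sigma_z^{-1}H_s\Sigma_c^*H_s^T\right|, \qquad (76)$$

$^{12}$Costa's scheme is a block-coding scheme and, strictly speaking, encoding is performed on the vector
$(X_{c1}^n, X_{c2}^n)$ given $X_{p1}^n$ and $X_{p2}^n$.

where $\Sigma_p^*$ and $\Sigma_c^*$ are the solutions of

$$\arg\max_{(\Sigma_p, \Sigma_c) \in \mathcal{S}(P_p, P_c)} \mu R_p(\Sigma_p, \Sigma_c) + R_c(\Sigma_p, \Sigma_c), \qquad (77)$$

where $\mu > 1$ and $\mathcal{S}(P_p, P_c) \triangleq \{\Sigma_p \succeq 0, \Sigma_c \succeq 0 : (\Sigma_p + \Sigma_c)_{11} \le P_p, (\Sigma_p + \Sigma_c)_{22} \le P_c\}$.

Since the per-antenna power constraints must be met with equality,$^{13}$ we can, without
loss of generality, write

$$\Sigma_p = \begin{bmatrix} \beta P_p & k_p \\ k_p & \alpha P_c \end{bmatrix}, \quad \text{where } k_p \in \left[-\sqrt{\alpha\beta P_p P_c}, \sqrt{\alpha\beta P_p P_c}\right], \qquad (78)$$

$$\Sigma_c = \begin{bmatrix} (1-\beta)P_p & k_c \\ k_c & (1-\alpha)P_c \end{bmatrix}, \quad \text{where } k_c \in \left[-\sqrt{\bar{\alpha}\bar{\beta} P_p P_c}, \sqrt{\bar{\alpha}\bar{\beta} P_p P_c}\right], \qquad (79)$$

and $\beta \in [0, 1]$, $\alpha \in [0, 1]$ and $\bar{\alpha} = 1 - \alpha$, $\bar{\beta} = 1 - \beta$. With $\Sigma_c$ expressed in this way, we
obtain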



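$$\lim_{M \to \infty}\lim_{\epsilon \to 0} \Sigma_z^{-1}H_s\Sigma_cH_s^T = \begin{bmatrix} (1-\alpha)P_c & (1-\alpha)P_c \\ 0 & 0 \end{bmatrix} \qquad (80)$$

in (76). Similarly, by direct matrix calculations we get

$$\lim_{M \to \infty}\lim_{\epsilon \to 0} \left(I + \Sigma_z^{-1}H_p\Sigma_cH_p^T\right)^{-1} = \begin{bmatrix} \dfrac{1}{(1-\beta)P_p + 2ak_c + a^2(1-\alpha)P_c + 1} & \dfrac{-(1-\beta)P_p - ak_c}{(1-\beta)P_p + 2ak_c + a^2(1-\alpha)P_c + 1} \\ 0 & 1 \end{bmatrix}, \qquad (81)$$

$$\lim_{M \to \infty}\lim_{\epsilon \to 0} \Sigma_z^{-1}H_p\Sigma_pH_p^T = \begin{bmatrix} \beta P_p + 2ak_p + a^2\alpha P_c & \beta P_p + ak_p \\ 0 & 0 \end{bmatrix}. \qquad (82)$$

Hence, on the one hand we have, by the continuity of $R_c(\Sigma_p, \Sigma_c)$ in $M$ and $\epsilon$, that

$$\lim_{M \to \infty}\lim_{\epsilon \to 0} R_c(\Sigma_p, \Sigma_c) = \frac{1}{2}\log\left(1 + (1-\alpha)P_c\right), \qquad (83)$$

for any choice of $\beta \in [0, 1]$. On the other hand, we have, by the continuity of $R_p(\Sigma_p, \Sigma_c)$
in $M$ and $\epsilon$, that

$$\lim_{M \to \infty}\lim_{\epsilon \to 0} R_p(\Sigma_p, \Sigma_c) = \frac{1}{2}\log\left(1 + \frac{\beta P_p + 2ak_p + a^2\alpha P_c}{(1-\beta)P_p + 2ak_c + a^2(1-\alpha)P_c + 1}\right). \qquad (84)$$

The limiting rate (84) is maximized by choosing $\beta = 1$ (and, therefore, $k_c = 0$) and
$k_p = \sqrt{\alpha P_p P_c}$. Thus,

$$\Sigma_p^* = \begin{bmatrix} P_p & \sqrt{\alpha P_p P_c} \\ \sqrt{\alpha P_p P_c} & \alpha P_c \end{bmatrix}, \qquad (85)$$

$$\Sigma_c^* = \begin{bmatrix} 0 & 0 \\ 0 & (1-\alpha)P_c \end{bmatrix}, \qquad (86)$$

which is achieved by simply choosing

$$\mathbf{u}_{p1} = \begin{bmatrix} \sqrt{P_p} \\ \sqrt{\alpha P_c} \end{bmatrix}, \quad \mathbf{u}_{c1} = \begin{bmatrix} 0 \\ \sqrt{(1-\alpha)P_c} \end{bmatrix}, \quad \mathbf{u}_{p2} = 0, \quad \mathbf{u}_{c2} = 0. \qquad (87)$$

Therefore, in the limit as $M \to \infty$ and $\epsilon \to 0$, this scheme achieves the rates given by
(24) and (25) in the aligned $(1, a, 0, 1)$-IC-DMS(A,B).

$^{13}$If, instead, antenna 1 uses only $P_p - \eta$ power, we can add another antenna with power $\eta$ whose
signal the receivers can first decode and then subtract off, thus boosting at least one of the rates. The
same applies to antenna 2.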
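As a numerical sanity check of this step, the Python sketch below plugs the covariance choices (85) and (86) into the limiting expressions (83) and (84) and compares the result with the closed-form rate pair of the proposed scheme. It assumes that, in the limit, only the original antenna at each receiver matters (channel rows $[1\ a]$ and $[0\ 1]$); the function names and parameter values are hypothetical.

```python
import math


def quad(h, S):
    """Quadratic form h^T S h for a length-2 vector h and a 2x2 matrix S."""
    return sum(h[i] * S[i][j] * h[j] for i in range(2) for j in range(2))


def limiting_rates(a, P_p, P_c, alpha):
    """Evaluate (84) and (83) at beta = 1, k_c = 0, k_p = sqrt(alpha P_p P_c)."""
    s = math.sqrt(alpha * P_p * P_c)
    sigma_p = [[P_p, s], [s, alpha * P_c]]                # (85)
    sigma_c = [[0.0, 0.0], [0.0, (1.0 - alpha) * P_c]]    # (86)
    h_p, h_s = [1.0, a], [0.0, 1.0]  # assumed: original receive antennas, b = 0
    R_p = 0.5 * math.log(1.0 + quad(h_p, sigma_p) / (1.0 + quad(h_p, sigma_c)))
    R_c = 0.5 * math.log(1.0 + quad(h_s, sigma_c))
    return R_p, R_c


def closed_form_rates(a, P_p, P_c, alpha):
    """Rate pair of the proposed scheme, written in closed form."""
    R_p = 0.5 * math.log(
        1.0 + (math.sqrt(P_p) + a * math.sqrt(alpha * P_c)) ** 2
        / (1.0 + a * a * (1.0 - alpha) * P_c)
    )
    R_c = 0.5 * math.log(1.0 + (1.0 - alpha) * P_c)
    return R_p, R_c


# Made-up parameters; the two computations should agree up to rounding.
lim = limiting_rates(a=0.5, P_p=2.0, P_c=3.0, alpha=0.4)
ref = closed_form_rates(a=0.5, P_p=2.0, P_c=3.0, alpha=0.4)
```

The agreement reflects the identity $P_p + 2a\sqrt{\alpha P_p P_c} + a^2\alpha P_c = (\sqrt{P_p} + a\sqrt{\alpha P_c})^2$ in the numerator of (84).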



A.2.3 The aligned (1, a, 0, 1)-IC-DMS(A,B): The converse 

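Since both $H_p$ and $H_s$ are invertible for every $\epsilon > 0$ and $a \ne 0$, we can equivalently
represent this channel by the equations

$$\mathbf{Y}_p = \mathbf{X} + \mathbf{Z}_p, \qquad (88)$$

$$\mathbf{Y}_s = \mathbf{X} + \mathbf{Z}_s. \qquad (89)$$

The new noise vectors are given by $\mathbf{Z}_p \sim \mathcal{N}(0, H_p^{-1}\Sigma_zH_p^{-T})$ and $\mathbf{Z}_s \sim \mathcal{N}(0, H_s^{-1}\Sigma_zH_s^{-T})$.
This channel is then exactly in the form of an Aligned MIMO BC channel (AMBC) (see
[19], Section 2), but with $m_p$ revealed to the secondary receiver.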



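Let $\mathbf{Y}_p^n \in \mathbb{R}^{2 \times n}$ and $\mathbf{Y}_s^n \in \mathbb{R}^{2 \times n}$ denote the channel outputs over a block of $n$ channel
uses. We can upper bound any achievable rate $R_p$ as follows

$$\begin{aligned}
nR_p &= H(m_p), \qquad (90) \\
&= I(m_p; \mathbf{Y}_p^n) + H(m_p|\mathbf{Y}_p^n), \qquad (91) \\
&\overset{(a)}{\le} I(m_p; \mathbf{Y}_p^n) + n\epsilon_{p,n}, \qquad (92)
\end{aligned}$$

where (a) follows from Fano's inequality with $\epsilon_{p,n} \to 0$ as $n \to \infty$. Noting that the
secondary receiver observes the tuple $(\mathbf{Y}_s^n, m_p)$, we can write

$$\begin{aligned}
nR_c &= H(m_c), \qquad (93) \\
&= H(m_c) + H(m_c|(\mathbf{Y}_s^n, m_p)) - H(m_c|(\mathbf{Y}_s^n, m_p)), \qquad (94) \\
&= I(m_c; (\mathbf{Y}_s^n, m_p)) + H(m_c|(\mathbf{Y}_s^n, m_p)), \qquad (95) \\
&\overset{(a)}{\le} I(m_c; (\mathbf{Y}_s^n, m_p)) + n\epsilon_{s,n}, \qquad (96) \\
&\overset{(b)}{=} I(m_c; \mathbf{Y}_s^n|m_p) + n\epsilon_{s,n}, \qquad (97)
\end{aligned}$$

where (a) follows from Fano's inequality with $\epsilon_{s,n} \to 0$ as $n \to \infty$, and (b) follows since
$I(m_p; m_c) = 0$.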

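Thus, we can upper-bound the linear functional of the achievable rates as

$$\begin{aligned}
\mu R_p + R_c &\le \frac{\mu}{n} I(m_p; \mathbf{Y}_p^n) + \frac{1}{n} I(m_c; \mathbf{Y}_s^n|m_p) + \mu\epsilon_{p,n} + \epsilon_{s,n}, \\
&= \frac{\mu}{n} h(\mathbf{Y}_p^n) - \frac{\mu}{n} h(\mathbf{Y}_p^n|m_p) + \frac{1}{n} h(\mathbf{Y}_s^n|m_p) - \frac{1}{n} h(\mathbf{Z}_s^n) + \mu\epsilon_{p,n} + \epsilon_{s,n},
\end{aligned} \qquad (98)$$

where $\mu > 1$.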

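Now, from Proposition 4.2 of [19], we know that, for every $\mu > 1$, there exists an
enhanced Aligned Degraded BC channel (ADBC) which contains the capacity region of
the AMBC given by (88) and (89), and for which the maximum of the linear functional
$\mu R_p + R_c$, over all $(R_p, R_c)$ in the region, is equal to the maximum of the same linear
functional over the capacity region of the corresponding AMBC (i.e., the two regions
meet at the point of tangency). Due to the degradedness, we can write the channel
outputs of the enhanced ADBC as

$$\mathbf{Y}_s^n = \mathbf{X}^n + \hat{\mathbf{Z}}_s^n, \qquad (99)$$

$$\mathbf{Y}_p^n = \mathbf{Y}_s^n + \hat{\mathbf{Z}}^n, \qquad (100)$$

where the matrices $\hat{\mathbf{Z}}_s^n$ and $\hat{\mathbf{Z}}^n$ are constructed such that their columns, denoted by $\hat{\mathbf{Z}}_s$
and $\hat{\mathbf{Z}}$, are independent, zero-mean Gaussian with covariances satisfying $\Sigma_{\hat{Z}_s} \preceq H_s^{-1}\Sigma_zH_s^{-T}$
and $\Sigma_{\hat{Z}_s} + \Sigma_{\hat{Z}} \preceq H_p^{-1}\Sigma_zH_p^{-T}$ (see the proof of Proposition 4.2 of [19] for how to construct them).
Hence, for this enhanced ADBC, we can write (98) as

$$\begin{aligned}
\mu R_p + R_c &\le \frac{\mu}{n} h(\mathbf{Y}_p^n) + \frac{1}{n} h(\mathbf{Y}_s^n|m_p) - \frac{\mu}{n} h(\mathbf{Y}_p^n|m_p) - \frac{1}{n} h(\hat{\mathbf{Z}}_s^n) + \mu\epsilon_{p,n} + \epsilon_{s,n}, \\
&= \frac{\mu}{n} h(\mathbf{Y}_s^n + \hat{\mathbf{Z}}^n) + \frac{1}{n} h(\mathbf{Y}_s^n|m_p) - \frac{\mu}{n} h(\mathbf{Y}_s^n + \hat{\mathbf{Z}}^n|m_p) - \frac{1}{n} h(\hat{\mathbf{Z}}_s^n) + \mu\epsilon_{p,n} + \epsilon_{s,n}, \\
&\le \frac{\mu}{n} h(\mathbf{Y}_p^n) + \frac{1}{n} h(\mathbf{Y}_s^n|m_p) - \mu\log\left(e^{\frac{1}{n}h(\mathbf{Y}_s^n|m_p)} + e^{\frac{1}{n}h(\hat{\mathbf{Z}}^n)}\right) - \frac{1}{n} h(\hat{\mathbf{Z}}_s^n) + \mu\epsilon_{p,n} + \epsilon_{s,n},
\end{aligned} \qquad (102)$$

where we have used the conditional version of the vector Entropy Power Inequality (see
Proposition D.2) in the last step.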



The key property of this enhanced ADBC is that the upper bound (102) is maximized
by choosing the input $\mathbf{X}$ to be Gaussian, i.e., the vector EPI is tight (see the proof of
Theorem 3.1 of [19]). Hence, an optimal achievable scheme for this ADBC is the Costa
precoding strategy$^{14}$ that is described in Section A.2.2. The largest jointly achievable
rates are given by

$$R_p = R_p(\Sigma_p^*, \Sigma_c^*), \qquad (103)$$

$$R_c = R_c(\Sigma_p^*, \Sigma_c^*), \qquad (104)$$

where $R_p(\Sigma_p^*, \Sigma_c^*)$ and $R_c(\Sigma_p^*, \Sigma_c^*)$ are as given by (75) and (76), respectively.

Since this scheme is also achievable for the AMBC, the capacity regions of the ADBC
and AMBC are identical (see Theorem 4.1 of [19]). Moreover, it is obvious that this
scheme is also achievable for the AMBC with additional knowledge of $m_p$ at the secondary
receiver: the knowledge of $m_p$ is simply ignored by the receiver. Hence, this scheme is
optimal for the aligned $(1, a, 0, 1)$-IC-DMS(A,B) (as defined by (67) and (68)) with $\mu > 1$
as well.

$^{14}$Note that for the ADBC a simple superposition scheme is also optimal.

Since the Pareto-optimal (for $\mu > 1$) rates for the limiting (as $M \to \infty$ and $\epsilon \to 0$)
aligned $(1, a, 0, 1)$-IC-DMS(A,B) exactly match the rates (24) and (25) achievable in the
original $(1, a, b, 1)$-IC-DMS channel, and since the capacity region of the $(1, a, b, 1)$-IC-DMS
is contained in the capacity region of the aligned $(1, a, 0, 1)$-IC-DMS(A,B) for any
$M, \epsilon > 0$, we have completed the proof of the converse part of Theorem 4.1 for $\mu > 1$.



B Proof of Corollary 4.1



The proof of this corollary follows from Theorem 4.1 and Lemma D.2. In particular, we
observe that the converse to Theorem 4.1 for $\mu > 1$ (see Section A.2) holds for any $a > 0$
and $b \in \mathbb{R}$. However, from Lemma D.2 we see that the choice $\alpha = 1$ in (24) and (25) is
optimal for any $a > 1$, as long as $\mu > 1$. Hence the corollary is proved. □

Remark: This result implies that, for any $a > 1$, $b \in \mathbb{R}$ and $\mu > 1$, the linear functional
$\mu R_p + R_c$ is maximized at $(R_p, R_c) = (C_{\text{sum}}(a), 0)$. Hence, for $a > 1$, the entire capacity
region is parametrized by $\mu \le 1$, for any $b \in \mathbb{R}$.



C Proof of the converse part of Theorem 4.3



Let "genie B" disclose $m_c$ to the primary transmitter, thus getting a $2 \times 1$ MIMO BC channel
with per-antenna power constraints. The input-output relationship for this channel
can be written as

$$Y_p = \mathbf{h}_p^T\mathbf{X} + Z_p, \qquad (105)$$

$$Y_s = \mathbf{h}_s^T\mathbf{X} + Z_s, \qquad (106)$$

where $\mathbf{h}_p = [1\ a]^T$ and $\mathbf{h}_s = [b\ 1]^T$. We choose $\mu \le 1$ in the linear functional $\mu R_p + R_c$
and recall that the optimal transmission vector $\mathbf{X}$ is Gaussian and given by (70) and the
optimal encoding strategy is to generate $X_p$ by Costa precoding for $\mathbf{h}_p^T(X_{c1}\mathbf{u}_{c1} + X_{c2}\mathbf{u}_{c2})$
(see [?]). Consequently, in place of (75) and (76), we get, respectively,

$$R_p = R_p(\Sigma_p^*, \Sigma_c^*) \triangleq \frac{1}{2}\log\left(1 + \mathbf{h}_p^T\Sigma_p^*\mathbf{h}_p\right), \qquad (107)$$

$$R_c = R_c(\Sigma_p^*, \Sigma_c^*) \triangleq \frac{1}{2}\log\left(1 + \frac{\mathbf{h}_s^T\Sigma_c^*\mathbf{h}_s}{1 + \mathbf{h}_s^T\Sigma_p^*\mathbf{h}_s}\right), \qquad (108)$$

where $\Sigma_p^*$ and $\Sigma_c^*$ are the solutions of (77) but with $\mu \le 1$. Substituting the covariance
matrices (78) and (79) into (107) and (108), we get

$$R_p(\Sigma_p, \Sigma_c) = R_p(\beta, \alpha, k_p, a, b) = \frac{1}{2}\log\left(1 + \beta P_p + 2ak_p + \alpha a^2 P_c\right), \qquad (109)$$

$$R_c(\Sigma_p, \Sigma_c) = R_c(\beta, \alpha, k_p, a, b) = \frac{1}{2}\log\left(1 + \frac{b^2(1-\beta)P_p + 2bk_c + (1-\alpha)P_c}{1 + b^2\beta P_p + 2bk_p + \alpha P_c}\right). \qquad (110)$$

The expression in (110) is maximized by choosing $k_c = \sqrt{(1-\beta)(1-\alpha)P_pP_c}$, i.e., making
$\Sigma_c$ unit rank. If $b = 0$ it is clear that $\beta = 1$ and $k_p = \sqrt{\alpha P_pP_c}$ maximize the linear
functional $\mu R_p(\beta, \alpha, k_p, a, b) + R_c(\beta, \alpha, k_p, a, b)$. In general, we would like to find the set
of all values of $b$ for which $\beta = 1$ and $k_p = \sqrt{\alpha P_pP_c}$ are optimal. For such values of $b$,
we then have

$$R_p(\Sigma_p, \Sigma_c) = \frac{1}{2}\log\left(1 + \left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2\right), \qquad (111)$$

$$R_c(\Sigma_p, \Sigma_c) = \frac{1}{2}\log\left(1 + \frac{(1-\alpha)P_c}{1 + \left(b\sqrt{P_p} + \sqrt{\alpha P_c}\right)^2}\right), \qquad (112)$$

which exactly match the achievable rates given in Lemma 4.2. To this end, let $B(\mu, a)$
denote the set of all $b \ge 0$ such that the function

$$\max_{0 \le \alpha \le 1} \mu R_p(\beta, \alpha, k_p, a, b) + R_c(\beta, \alpha, k_p, a, b) \qquad (113)$$

is maximized, over all $\beta \in [0, 1]$ and $k_p \in \left[-\sqrt{\beta\alpha P_pP_c}, \sqrt{\beta\alpha P_pP_c}\right]$, by choosing $\beta = 1$ and
$k_p = \sqrt{\alpha P_pP_c}$. We let $b_{\max}(\mu, a) = \sup_{b \in B(\mu, a)} b$ to obtain the statement of the theorem.
Appealing to the remark in the proof of Corollary 4.1 (see Appendix B), we observe
that the boundary of the capacity region in this very-high-interference-gain regime is
completely parametrized by $\mu \le 1$. Hence, we have proved the theorem.



D Supporting results 

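Proposition D.1 The rate region specified in Theorem 4.1 is a convex set.



Proof: A point $R = (R_p, R_c)$ is in the rate region specified in Theorem 4.1 if and only if
there exists $\alpha \in [0, 1]$ such that

$$0 \le R_c \le \frac{1}{2}\log\left(1 + (1-\alpha)P_c\right), \qquad (114)$$

$$0 \le R_p \le \frac{1}{2}\log\left(1 + a^2P_c + P_p + 2a\sqrt{\alpha P_pP_c}\right) + \frac{1}{2}\log\left(\frac{1}{1 + a^2(1-\alpha)P_c}\right). \qquad (115)$$

Suppose that there exist two points $R^{(1)} = (R_p^{(1)}, R_c^{(1)})$ and $R^{(2)} = (R_p^{(2)}, R_c^{(2)})$ that are in
the region. Let $\alpha^{(1)} \in [0, 1]$ and $\alpha^{(2)} \in [0, 1]$ be their corresponding parameters in (114)
and (115). Then for any $\lambda \in [0, 1]$, we have that

$$\begin{aligned}
\lambda R_c^{(1)} + (1-\lambda)R_c^{(2)} &\le \frac{\lambda}{2}\log\left(1 + (1-\alpha^{(1)})P_c\right) + \frac{1-\lambda}{2}\log\left(1 + (1-\alpha^{(2)})P_c\right), \qquad (116) \\
&\le \frac{1}{2}\log\left(1 + (1-\alpha^*)P_c\right), \qquad (117)
\end{aligned}$$

where $\alpha^* = \lambda\alpha^{(1)} + (1-\lambda)\alpha^{(2)}$ and the last inequality follows from Jensen's inequality.
Similarly,

$$\begin{aligned}
\lambda R_p^{(1)} + (1-\lambda)R_p^{(2)} &\le \frac{\lambda}{2}\log\left(1 + a^2P_c + P_p + 2a\sqrt{\alpha^{(1)}P_pP_c}\right) \qquad (118) \\
&\quad + \frac{1-\lambda}{2}\log\left(1 + a^2P_c + P_p + 2a\sqrt{\alpha^{(2)}P_pP_c}\right) \qquad (119) \\
&\quad + \frac{\lambda}{2}\log\left(\frac{1}{1 + a^2(1-\alpha^{(1)})P_c}\right) \qquad (120) \\
&\quad + \frac{1-\lambda}{2}\log\left(\frac{1}{1 + a^2(1-\alpha^{(2)})P_c}\right) \qquad (121) \\
&\overset{(a)}{\le} \frac{1}{2}\log\left(1 + a^2P_c + P_p + 2a\sqrt{P_pP_c}\left(\lambda\sqrt{\alpha^{(1)}} + (1-\lambda)\sqrt{\alpha^{(2)}}\right)\right) \qquad (122) \\
&\quad + \frac{1}{2}\log\left(\frac{1}{1 + a^2\left(1 - \lambda\alpha^{(1)} - (1-\lambda)\alpha^{(2)}\right)P_c}\right) \qquad (123) \\
&\overset{(b)}{\le} \frac{1}{2}\log\left(1 + a^2P_c + P_p + 2a\sqrt{\alpha^*P_pP_c}\right) + \frac{1}{2}\log\left(\frac{1}{1 + a^2(1-\alpha^*)P_c}\right), \qquad (124)
\end{aligned}$$

where (a) follows from Jensen's inequality applied to the concave function $\log(k_1 + k_2x)$ (for
constant $k_1, k_2 > 0$) and the concave function $\log\left(\frac{1}{1 + k(1-x)}\right)$ (for constant $k > 0$).
Inequality (b) follows from Jensen's inequality applied to the square-root function. Hence
$\lambda R^{(1)} + (1-\lambda)R^{(2)}$ is in the region as well, hence the region is a convex set. □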



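Proposition D.2 (Conditional EPI) Suppose $\tilde{Y}^n \in \mathbb{R}^n$ and $\tilde{Z}^n \in \mathbb{R}^n$ are independent
random vectors and $m \in \{1, 2, \ldots, M\}$ (for some $M$) is independent of $\tilde{Z}^n$. Then we
have that

$$h(\tilde{Y}^n + \tilde{Z}^n|m) \ge \frac{n}{2}\log\left(e^{\frac{2}{n}h(\tilde{Y}^n|m)} + e^{\frac{2}{n}h(\tilde{Z}^n)}\right). \qquad (125)$$

Proof:

$$\begin{aligned}
h(\tilde{Y}^n + \tilde{Z}^n|m) &= \sum_{i=1}^{M} h(\tilde{Y}^n + \tilde{Z}^n|m = i)\,P(m = i), \qquad (126) \\
&\overset{(a)}{\ge} \sum_{i=1}^{M} \frac{n}{2}\log\left(e^{\frac{2}{n}h(\tilde{Y}^n|m = i)} + e^{\frac{2}{n}h(\tilde{Z}^n)}\right)P(m = i), \qquad (127) \\
&\overset{(b)}{\ge} \frac{n}{2}\log\left(e^{\frac{2}{n}h(\tilde{Y}^n|m)} + e^{\frac{2}{n}h(\tilde{Z}^n)}\right), \qquad (128)
\end{aligned}$$

where (a) follows from the classical Entropy Power Inequality (EPI), and
(b) follows from Jensen's inequality applied to the convex function $\log(e^{2x/n} + k)$ (for
constant $k$ and $n$). □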
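The inequality can be checked numerically in a simple scalar case. The Python sketch below assumes $n = 1$, a conditionally Gaussian $\tilde{Y}$ whose variance depends on $m$, and independent Gaussian noise; the function names and the numerical values are hypothetical, and entropies are in nats. In this setting step (a) holds with equality, so any gap comes from the Jensen step (b).

```python
import math

TWO_PI_E = 2.0 * math.pi * math.e


def gaussian_entropy(var):
    """Differential entropy (nats) of a scalar Gaussian with variance var."""
    return 0.5 * math.log(TWO_PI_E * var)


def conditional_epi_sides(cond_vars, probs, noise_var):
    """Return (lhs, rhs) of (125) for n = 1 when Y | m = i ~ N(0, cond_vars[i])
    and Z ~ N(0, noise_var) is independent of (Y, m)."""
    lhs = sum(p * gaussian_entropy(v + noise_var) for v, p in zip(cond_vars, probs))
    h_y_given_m = sum(p * gaussian_entropy(v) for v, p in zip(cond_vars, probs))
    rhs = 0.5 * math.log(
        math.exp(2.0 * h_y_given_m) + math.exp(2.0 * gaussian_entropy(noise_var))
    )
    return lhs, rhs


# Two equally likely messages with different conditional variances: strict inequality.
lhs, rhs = conditional_epi_sides([1.0, 4.0], [0.5, 0.5], noise_var=1.0)
# Identical conditional variances: the bound is met with equality.
lhs_eq, rhs_eq = conditional_epi_sides([2.0, 2.0], [0.5, 0.5], noise_var=1.0)
```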



Lemma D.1 Given two zero-mean random variables $X$ and $Y$ with a fixed covariance
matrix $K_{XY}$ we have that

$$h(Y|X) \le \frac{1}{2}\log\left(2\pi e\left(\mathbb{E}[Y^2] - \frac{\mathbb{E}[XY]^2}{\mathbb{E}[X^2]}\right)\right), \qquad (129)$$

with equality when $X$ and $Y$ are jointly Gaussian.

Proof: Let $\beta = \frac{\mathbb{E}[XY]}{\mathbb{E}[X^2]}$. Then the linear MMSE estimator of $Y$ given $X$ is given by $\hat{Y} = \beta X$.

$$\begin{aligned}
h(Y|X) &\overset{(a)}{=} h(Y - \beta X|X), \qquad (130) \\
&\overset{(b)}{\le} h(Y - \beta X), \qquad (131) \\
&\overset{(c)}{\le} \frac{1}{2}\log\left(2\pi e\,\mathbb{E}[(Y - \beta X)^2]\right), \qquad (132)
\end{aligned}$$

where (a) follows from the fact that shifts do not change the differential entropy, (b)
follows since conditioning does not increase entropy, and (c) follows since the Gaussian
distribution maximizes the entropy for a given variance. By the orthogonality principle,
(b) is tight when $X$ and $Y$ are jointly Gaussian and in that case (c) is tight as well. □
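For completeness, the bound (132) reduces to the form stated in (129) by expanding the mean-squared error with $\beta = \mathbb{E}[XY]/\mathbb{E}[X^2]$; the short derivation below is added only to make that link explicit.

```latex
\begin{aligned}
\mathbb{E}\left[(Y - \beta X)^2\right]
  &= \mathbb{E}[Y^2] - 2\beta\,\mathbb{E}[XY] + \beta^2\,\mathbb{E}[X^2] \\
  &= \mathbb{E}[Y^2] - 2\,\frac{\mathbb{E}[XY]^2}{\mathbb{E}[X^2]} + \frac{\mathbb{E}[XY]^2}{\mathbb{E}[X^2]} \\
  &= \mathbb{E}[Y^2] - \frac{\mathbb{E}[XY]^2}{\mathbb{E}[X^2]}.
\end{aligned}
```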

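Lemma D.2

$$\begin{aligned}
&\max_{0 \le \alpha \le 1} \frac{\mu}{2}\log\left(1 + \frac{\left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2}{1 + a^2(1-\alpha)P_c}\right) + \frac{1}{2}\log\left(1 + (1-\alpha)P_c\right) \qquad (134) \\
&\qquad = \frac{\mu}{2}\log\left(1 + \left(\sqrt{P_p} + a\sqrt{P_c}\right)^2\right) \qquad (135)
\end{aligned}$$

for $a > 1$ and $\mu > 1$.

Proof: On the one hand we have that

$$\begin{aligned}
&\max_{0 \le \alpha \le 1} \frac{\mu}{2}\log\left(1 + \frac{\left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2}{1 + a^2(1-\alpha)P_c}\right) + \frac{1}{2}\log\left(1 + (1-\alpha)P_c\right) \\
&\qquad = \max_{0 \le \alpha \le 1} \frac{1}{2}\log\left(\frac{\left(1 + a^2(1-\alpha)P_c + \left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2\right)^{\mu}\left(1 + (1-\alpha)P_c\right)}{\left(1 + a^2(1-\alpha)P_c\right)^{\mu}}\right) \qquad (136) \\
&\qquad \le \max_{0 \le \alpha \le 1} \frac{1}{2}\log\left(\frac{\left(1 + a^2(1-\alpha)P_c + \left(\sqrt{P_p} + a\sqrt{\alpha P_c}\right)^2\right)^{\mu}}{\left(1 + a^2(1-\alpha)P_c\right)^{\mu-1}}\right) \qquad (137) \\
&\qquad = \max_{0 \le \alpha \le 1} \frac{1}{2}\log\left(\frac{\left(1 + P_p + a^2P_c + 2a\sqrt{\alpha P_pP_c}\right)^{\mu}}{\left(1 + a^2(1-\alpha)P_c\right)^{\mu-1}}\right) \qquad (138) \\
&\qquad = \frac{\mu}{2}\log\left(1 + \left(\sqrt{P_p} + a\sqrt{P_c}\right)^2\right). \qquad (139)
\end{aligned}$$

On the other hand, the maximization problem in (134) can be lower bounded with
$\frac{\mu}{2}\log\left(1 + \left(\sqrt{P_p} + a\sqrt{P_c}\right)^2\right)$, by choosing $\alpha = 1$. Hence the lemma is proved. □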
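The statement of the lemma can be illustrated numerically with a grid search over $\alpha$. The following Python sketch is only a spot check for one made-up parameter choice with $a > 1$ and $\mu > 1$; the function names are hypothetical and the grid resolution is arbitrary.

```python
import math


def objective(alpha, mu, a, P_p, P_c):
    """Expression maximized in (134), evaluated at a given alpha (nats)."""
    r_p = 0.5 * math.log(
        1.0 + (math.sqrt(P_p) + a * math.sqrt(alpha * P_c)) ** 2
        / (1.0 + a * a * (1.0 - alpha) * P_c)
    )
    r_c = 0.5 * math.log(1.0 + (1.0 - alpha) * P_c)
    return mu * r_p + r_c


def argmax_alpha(mu, a, P_p, P_c, grid=1001):
    """Grid point in [0, 1] at which the objective is largest."""
    alphas = [i / (grid - 1) for i in range(grid)]
    return max(alphas, key=lambda al: objective(al, mu, a, P_p, P_c))


# Made-up parameters in the regime of the lemma (a > 1, mu > 1).
best_alpha = argmax_alpha(mu=2.0, a=1.5, P_p=1.0, P_c=1.0)
best_value = objective(best_alpha, 2.0, 1.5, 1.0, 1.0)
```

For this choice the maximizer is $\alpha = 1$ and the maximum equals the right-hand side of (135), consistent with the lemma.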
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